Towards Agentic Vision Language Models for Question Answering on Interactive Dashboard
| dc.contributor.advisor | Prince, Enamul Hoque | |
| dc.contributor.author | Kartha, Aaryaman Sudhir | |
| dc.date.accessioned | 2026-03-10T16:15:40Z | |
| dc.date.available | 2026-03-10T16:15:40Z | |
| dc.date.copyright | 2025-12-15 | |
| dc.date.issued | 2026-03-10 | |
| dc.date.updated | 2026-03-10T16:15:40Z | |
| dc.degree.discipline | Computer Science | |
| dc.degree.level | Master's | |
| dc.degree.name | MSc - Master of Science | |
| dc.description.abstract | Multimodal models, specifically Vision Language Models (VLMs), have shown increasing capabilities in data-visualization-oriented downstream tasks, reaching performance saturation in progressively shorter intervals of time. Consequently, focus has shifted to assessing their potential on new frontiers, specifically interactive environments. Various benchmarks center on data visualization question answering over static visualizations, but such rudimentary approaches do not reflect real-world analysis scenarios, where extensive decision making is required. Although dashboards are commonplace tools in various industries, limited work has evaluated the capabilities of VLMs to traverse and reason with them. To address these limitations, this thesis presents DashboardQA, a novel benchmark for interactive dashboard question answering. Overall, it comprises 292 tasks encompassing 405 QA pairs across 5 diverse categories, drawn from 112 carefully chosen dashboards. Experimental results show that this benchmark is challenging for the various types of VLMs assessed, with the best model achieving 38.69%. | |
| dc.identifier.uri | https://hdl.handle.net/10315/43616 | |
| dc.language | en | |
| dc.rights | Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests. | |
| dc.subject | Computer science | |
| dc.subject.keywords | Natural language processing | |
| dc.subject.keywords | Question answering | |
| dc.subject.keywords | Human Computer Interaction | |
| dc.subject.keywords | Data visualization | |
| dc.subject.keywords | Interactive visualizations | |
| dc.subject.keywords | Dashboards | |
| dc.subject.keywords | Vision language models | |
| dc.title | Towards Agentic Vision Language Models for Question Answering on Interactive Dashboard | |
| dc.type | Electronic Thesis or Dissertation |
Files
Original bundle
- Name: Kartha_Aaryaman_Sudhir_2025_MSc.pdf
- Size: 28.64 MB
- Format: Adobe Portable Document Format