Towards Agentic Vision Language Models for Question Answering on Interactive Dashboard

Kartha, Aaryaman Sudhir

dc.contributor.advisor	Prince, Enamul Hoque
dc.contributor.author	Kartha, Aaryaman Sudhir
dc.date.accessioned	2026-03-10T16:15:40Z
dc.date.available	2026-03-10T16:15:40Z
dc.date.copyright	2025-12-15
dc.date.issued	2026-03-10
dc.date.updated	2026-03-10T16:15:40Z
dc.degree.discipline	Computer Science
dc.degree.level	Master's
dc.degree.name	MSc - Master of Science
dc.description.abstract	Multimodal models, specifically Vision Language Models (VLMs), have shown increasing capabilities in data-visualization-oriented downstream tasks, reaching performance saturation in shorter intervals of time. Consequently, focus has shifted to assessing their potential on new frontiers, specifically interactive environments. Various benchmarks center on question answering over static visualizations, and such rudimentary approaches do not reflect real-world analysis scenarios, where extensive decision making is required. Although dashboards are commonplace tools in many industries, limited work has evaluated the capabilities of VLMs to traverse and reason over them. To address these limitations, this thesis presents DashboardQA, a novel benchmark for interactive dashboard question answering. Overall, it comprises 292 tasks encompassing 405 QA pairs across 5 diverse categories, drawn from 112 carefully chosen dashboards. Experimental results show that the benchmark is challenging for the various types of VLMs assessed, with the best model achieving 38.69%.
dc.identifier.uri	https://hdl.handle.net/10315/43616
dc.language	en
dc.rights	Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject	Computer science
dc.subject.keywords	Natural language processing
dc.subject.keywords	Question answering
dc.subject.keywords	Human Computer Interaction
dc.subject.keywords	Data visualization
dc.subject.keywords	Interactive visualizations
dc.subject.keywords	Dashboards
dc.subject.keywords	Vision language models
dc.title	Towards Agentic Vision Language Models for Question Answering on Interactive Dashboard
dc.type	Electronic Thesis or Dissertation

Files

Original bundle

Name: Kartha_Aaryaman_Sudhir_2025_MSc.pdf
Size: 28.64 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.87 KB
Format: Plain Text

Name: YorkU_ETDlicense.txt
Size: 3.39 KB
Format: Plain Text

Collections

Computer Science