Towards Agentic Vision Language Models for Question Answering on Interactive Dashboard

dc.contributor.advisor: Prince, Enamul Hoque
dc.contributor.author: Kartha, Aaryaman Sudhir
dc.date.accessioned: 2026-03-10T16:15:40Z
dc.date.available: 2026-03-10T16:15:40Z
dc.date.copyright: 2025-12-15
dc.date.issued: 2026-03-10
dc.date.updated: 2026-03-10T16:15:40Z
dc.degree.discipline: Computer Science
dc.degree.level: Master's
dc.degree.name: MSc - Master of Science
dc.description.abstract: Multimodal models, specifically Vision Language Models (VLMs), have shown increasing capabilities on data-visualization-oriented downstream tasks, reaching performance saturation in ever shorter intervals of time. Consequently, focus has shifted to assessing their potential in new frontiers, specifically interactive environments. Various benchmarks center on question answering over static visualizations, and such rudimentary setups do not reflect real-world analysis scenarios, where extensive decision making is required. Although dashboards are commonplace tools in many industries, little work has evaluated the ability of VLMs to traverse and reason with them. To address these limitations, this thesis presents DashboardQA, a novel benchmark for interactive dashboard question answering. Overall, it comprises 292 tasks encompassing 405 QA pairs across 5 diverse category types, drawn from 112 carefully chosen dashboards. Experimental results show that the benchmark is challenging for the various types of VLMs assessed, with the best model achieving 38.69%.
dc.identifier.uri: https://hdl.handle.net/10315/43616
dc.language: en
dc.rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject: Computer science
dc.subject.keywords: Natural language processing
dc.subject.keywords: Question answering
dc.subject.keywords: Human Computer Interaction
dc.subject.keywords: Data visualization
dc.subject.keywords: Interactive visualizations
dc.subject.keywords: Dashboards
dc.subject.keywords: Vision language models
dc.title: Towards Agentic Vision Language Models for Question Answering on Interactive Dashboard
dc.type: Electronic Thesis or Dissertation

Files

Original bundle

Name: Kartha_Aaryaman_Sudhir_2025_MSc.pdf
Size: 28.64 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.87 KB
Format: Plain Text

Name: YorkU_ETDlicense.txt
Size: 3.39 KB
Format: Plain Text
