Authors: Enamul Hoque Prince; Parsa Kavehzadeh
Date: 2023-12-08
URI: https://hdl.handle.net/10315/41672

Title: Chart Question Answering with an Universal Vision-Language Pretraining Approach
Type: Electronic Thesis or Dissertation

Abstract: Charts are widely used for data analysis, providing visual representations and insights into complex data. To facilitate chart-based data analysis using natural language, several downstream tasks have been introduced recently, including chart question answering. However, existing methods for these tasks often rely on pretraining on language or vision-language tasks, neglecting the explicit modeling of chart structures. To address this, we first build a large corpus of charts covering diverse topics and visual styles. We then present UniChart, a pretrained model for chart comprehension and reasoning. We propose several chart-specific pretraining tasks, including: (i) low-level tasks to extract the visual elements (e.g., bars, lines) and data from charts, and (ii) high-level tasks to acquire chart understanding and reasoning skills. Our experiments demonstrate that pretraining UniChart on a large corpus with chart-specific objectives, followed by fine-tuning, yields state-of-the-art performance on four downstream tasks. Moreover, our model exhibits superior generalizability to unseen chart corpora, surpassing previous approaches that lack chart-specific objectives and utilize limited chart resources.

Rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.

Subject: Artificial intelligence
Keywords: natural language processing; information visualization; charts; chart question answering; question answering; pretraining; chart comprehension; transformers