Instruction-Tuning For Chart Comprehension And Reasoning

Date

2025-04-10

Authors

Shah Mohammadi, Mehrad

Abstract

Charts provide visual representations of data and are widely used for analyzing information, addressing queries, and conveying insights to others. Various chart-related downstream tasks have emerged recently, such as question answering and summarization. A common strategy for solving these tasks is to fine-tune models originally trained on vision-language tasks. However, such task-specific models cannot solve a wide range of chart-related tasks, which constrains their real-world applicability. To overcome these challenges, we introduce ChartInstruct: a novel chart-specific vision-language instruction-following dataset comprising instructions generated with distinct charts. We then present two distinct systems for instruction tuning on such datasets: (1) an end-to-end model and (2) a pipeline model employing a two-step approach. Evaluation shows that our instruction-tuning approach supports a wide array of real-world chart comprehension and reasoning scenarios, thereby expanding the scope and applicability of our models to new kinds of tasks.
