Toward Trustworthy Automated Data Story Generation: Benchmarking, Multi-Agent Generation and Bias Evaluation in Data Storytelling

Islam, Mohammed Saidul

Files

Islam_Mohammed_Saidul_2025_MSc.pdf (23.07 MB)

Date

2025-11-11

Authors

Islam, Mohammed Saidul

Abstract

Data-driven storytelling is a powerful method for conveying insights by combining narrative techniques with visualizations and text. In this thesis, we introduce a novel task for data story generation and a benchmark containing 1,449 stories from diverse sources. We propose a multi-step LLM agent framework that mimics the human storytelling process with two agents: one for planning and narration, and another for verification at each intermediate step. Results show that our proposed framework significantly outperforms non-agentic baselines. In parallel, we recognize that trustworthy storytelling must also be fair and unbiased. To this end, we conduct a large-scale empirical study to uncover systematic geo-economic bias in the foundational subtask of data storytelling: producing narrative summaries of charts. We further explore inference-time debiasing strategies and highlight the need for more robust bias mitigation methods. Together, these contributions provide both a powerful generative system and a fairness-focused evaluation to ensure that automated data storytelling is accurate, coherent, and ethically responsible.

Keywords

Computer science

URI

https://hdl.handle.net/10315/43335

Collections

Computer Science
