Toward Trustworthy Automated Data Story Generation: Benchmarking, Multi-Agent Generation and Bias Evaluation in Data Storytelling

Islam, Mohammed Saidul

dc.contributor.advisor	Prince, Enamul Hoque
dc.contributor.author	Islam, Mohammed Saidul
dc.date.accessioned	2025-11-11T20:08:47Z
dc.date.available	2025-11-11T20:08:47Z
dc.date.copyright	2025-08-20
dc.date.issued	2025-11-11
dc.date.updated	2025-11-11T20:08:46Z
dc.degree.discipline	Computer Science
dc.degree.level	Master's
dc.degree.name	MSc - Master of Science
dc.description.abstract	Data-driven storytelling is a powerful method for conveying insights by combining narrative techniques with visualizations and text. In this thesis, we introduce a novel task for data story generation and a benchmark containing 1,449 stories from diverse sources. We propose a multi-step LLM agent framework that mimics the human storytelling process using two agents: one for planning and narration, and another for verification at each intermediate step. Results show that our proposed framework significantly outperforms non-agentic baselines. In parallel, we recognize that trustworthy storytelling must also be fair and unbiased. To this end, we conduct a large-scale empirical study to uncover systematic geo-economic bias in the foundational subtask of data storytelling: producing narrative summaries of charts. We further explore inference-time debiasing strategies and highlight the need for more robust bias mitigation methods. Together, these contributions provide both a powerful generative system and a fairness-focused evaluation to ensure that automated data storytelling is accurate, coherent, and ethically responsible.
dc.identifier.uri	https://hdl.handle.net/10315/43335
dc.language	en
dc.rights	Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject	Computer science
dc.subject.keywords	Data storytelling
dc.subject.keywords	Bias evaluation
dc.subject.keywords	Large language models
dc.subject.keywords	LLM agents
dc.title	Toward Trustworthy Automated Data Story Generation: Benchmarking, Multi-Agent Generation and Bias Evaluation in Data Storytelling
dc.type	Electronic Thesis or Dissertation

Files

Original bundle

Name: Islam_Mohammed_Saidul_2025_MSc.pdf
Size: 23.07 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.87 KB
Format: Plain Text

Name: YorkU_ETDlicense.txt
Size: 3.39 KB
Format: Plain Text

Collections

Computer Science