Query Generation for Database Testing Via Machine Learning

dc.contributor.advisor: Yu, Xiaohui
dc.contributor.author: Yang, Yongtai
dc.date.accessioned: 2026-03-10T16:14:19Z
dc.date.available: 2026-03-10T16:14:19Z
dc.date.copyright: 2025-12-05
dc.date.issued: 2026-03-10
dc.date.updated: 2026-03-10T16:14:19Z
dc.degree.discipline: Information Systems and Technology
dc.degree.level: Master's
dc.degree.name: MA - Master of Arts
dc.description.abstract: Modern database management systems (DBMSs) are important to data-driven applications. However, testing a DBMS for bugs remains challenging because a DBMS is a very complex system. Bugs in a DBMS often appear only under specific execution plan patterns, such as nested-loop joins combined with aggregation. Reproducing such bugs requires generating SQL queries whose execution plans contain the pattern that triggers the bug. Neither existing rule-based query generators nor learning-based approaches can generate queries under such an execution plan pattern constraint. To overcome this limitation, we propose QueryMorpher, a plan-driven query generation framework that generates SQL queries from the problematic execution plan that triggers the bug. QueryMorpher begins with a problematic execution plan and a plan pattern that triggers the bug, and applies a sequence of learned plan mutation operations guided by a sequence-to-sequence model. The mutated plan is then translated back into SQL by a plan-to-query translation module, which guarantees that the resulting query reproduces the desired execution plan while remaining syntactically and semantically valid. Experimental results demonstrate that QueryMorpher can generate diverse and valid queries whose execution plans contain the user-defined patterns. On TPC-H, QueryMorpher achieves a target-pattern rate of 0.6 versus 0.4 for the best baseline, while maintaining 10% higher plan diversity under the same budget. On TPC-DS, QueryMorpher achieves similar improvements, indicating that it is stable across different database schemas. By bridging the gap between query generation and query execution plan control, QueryMorpher enables automated and controllable DBMS testing.
dc.identifier.uri: https://hdl.handle.net/10315/43605
dc.language: en
dc.rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject: Information technology
dc.subject: Computer science
dc.subject: Information science
dc.subject.keywords: Database testing
dc.subject.keywords: Machine learning
dc.subject.keywords: Query generation
dc.title: Query Generation for Database Testing Via Machine Learning
dc.type: Electronic Thesis or Dissertation

Files

Original bundle

Name: Yang_Yongtai_2025_MA.pdf
Size: 1.49 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.87 KB
Format: Plain Text

Name: YorkU_ETDlicense.txt
Size: 3.39 KB
Format: Plain Text