Query Generation for Database Testing Via Machine Learning

Authors

Yang, Yongtai

Abstract

Modern database management systems (DBMSs) are critical to data-driven applications. However, testing a DBMS for bugs remains challenging because of the system's complexity. DBMS bugs often surface only under specific execution plan patterns, such as a nested-loop join combined with aggregation. Reproducing such a bug requires generating SQL queries whose execution plans contain the pattern that triggers it. Existing rule-based query generators and learning-based approaches both fail to generate queries under an execution plan pattern constraint. To overcome this limitation, we propose QueryMorpher, a plan-driven query generation framework that generates SQL queries from the problematic execution plan that triggers the bug. QueryMorpher starts from a problematic execution plan and the plan pattern that triggers the bug, and applies a sequence of learned plan mutation operations guided by a sequence-to-sequence model. A plan-to-query translation module then translates the mutated plan back into SQL, guaranteeing that the resulting query reproduces the desired execution plan while remaining syntactically and semantically valid.
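To make the notion of a plan pattern concrete, the following minimal sketch checks whether an execution plan contains an aggregation node with a nested-loop join somewhere beneath it. It is an illustration only, not QueryMorpher's implementation. It assumes plans are nested dictionaries shaped like PostgreSQL's `EXPLAIN (FORMAT JSON)` output (`"Node Type"` and `"Plans"` keys); the helper names `iter_nodes` and `contains_pattern` and the example plan are hypothetical.

```python
from typing import Iterator


def iter_nodes(plan: dict) -> Iterator[dict]:
    """Yield a plan node and all of its descendants, depth first."""
    yield plan
    for child in plan.get("Plans", []):
        yield from iter_nodes(child)


def contains_pattern(plan: dict, ancestor_type: str, descendant_type: str) -> bool:
    """Return True if some node of ancestor_type has a descendant of descendant_type."""
    for node in iter_nodes(plan):
        if node.get("Node Type") != ancestor_type:
            continue
        for child in node.get("Plans", []):
            if any(d.get("Node Type") == descendant_type for d in iter_nodes(child)):
                return True
    return False


# Hypothetical plan: an aggregate computed over a nested-loop join of two scans.
example_plan = {
    "Node Type": "Aggregate",
    "Plans": [
        {
            "Node Type": "Nested Loop",
            "Plans": [
                {"Node Type": "Seq Scan"},
                {"Node Type": "Index Scan"},
            ],
        }
    ],
}

has_pattern = contains_pattern(example_plan, "Aggregate", "Nested Loop")
```

A generator constrained by such a pattern would keep only queries whose plans pass this kind of check; how QueryMorpher itself encodes and matches patterns is not specified in the abstract.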

Experimental results demonstrate that QueryMorpher generates diverse, valid queries whose execution plans contain the user-defined patterns. On TPC-H, QueryMorpher achieves a target-pattern rate of 0.6, compared with 0.4 for the best baseline, while maintaining 10% higher plan diversity under the same budget. On TPC-DS, QueryMorpher achieves similar improvements, indicating that it performs consistently across different database schemas. By bridging the gap between query generation and execution plan control, QueryMorpher enables automated and controllable DBMS testing.
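The abstract does not define its metrics, so the sketch below shows only one plausible reading, under stated assumptions: the target-pattern rate is taken to be the fraction of generated queries whose plan contains the target pattern, and plan diversity is taken to be the fraction of distinct plan signatures among the generated plans. Both definitions, the function names, and the toy inputs are assumptions for illustration, not the thesis's definitions or data.

```python
from typing import Hashable, Sequence


def target_pattern_rate(matches: Sequence[bool]) -> float:
    """Assumed definition: share of generated queries whose plan contains the pattern."""
    if not matches:
        return 0.0
    return sum(matches) / len(matches)


def plan_diversity(signatures: Sequence[Hashable]) -> float:
    """Assumed definition: share of distinct plan signatures among generated plans."""
    if not signatures:
        return 0.0
    return len(set(signatures)) / len(signatures)


# Toy inputs, not experimental data: five generated queries.
toy_matches = [True, True, True, False, False]
toy_signatures = ["agg>nl", "agg>nl", "agg>hash", "sort>nl", "sort>nl"]

rate = target_pattern_rate(toy_matches)
diversity = plan_diversity(toy_signatures)
```

Under these assumed definitions, both metrics lie between 0 and 1, and comparing methods "under the same budget" would mean computing them over the same number of generated queries.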

Keywords

Information technology, Computer science, Information science