Improving the Reliability of AI Infrastructure Software with Data-Driven Software Analytics

dc.contributor.advisor: Wang, Song
dc.contributor.author: Shiri Harzevili, Nima
dc.date.accessioned: 2025-04-10T10:59:25Z
dc.date.available: 2025-04-10T10:59:25Z
dc.date.copyright: 2025-02-11
dc.date.issued: 2025-04-10
dc.date.updated: 2025-04-10T10:59:24Z
dc.degree.discipline: Electrical Engineering & Computer Science
dc.degree.level: Doctoral
dc.degree.name: PhD - Doctor of Philosophy
dc.description.abstract: Today, AI systems are increasingly used in safety-critical fields such as transportation, finance, and robotics. While AI offers many benefits that simplify daily life, its widespread adoption has also increased security threats, highlighting the urgent need for secure AI; failing to protect AI systems against these threats could have disastrous consequences. Like traditional software, AI applications are built upon multiple layers: application and service, model, framework, library and compiler, and hardware. In this thesis, we first conduct an empirical study to characterize and understand security weaknesses in AI frameworks. We identify Memory Leak (CWE-401) and Integer Overflow (CWE-190) as the two most prevalent bug types, with improper validation of tensor properties and poor memory management as the most common root causes. Next, we assess the effectiveness of five popular static analysis tools for identifying bugs in AI frameworks. Our study shows that these tools detect only a small fraction of bugs; key limitations include a lack of support for AI-specific macros/APIs, tensor data types, and computation graphs. We then evaluate dynamic analysis techniques, specifically DL fuzz testing tools, on real-world bugs in AI frameworks. Our findings show that DL fuzzers detect only 6.5% (34 out of 517) of the unique bugs in our benchmark dataset, and we identify two main factors limiting the effectiveness of these tools. Based on these findings, we develop Orion, a novel API-level DL fuzzer that addresses the limitations of existing fuzzers and identifies new bugs in AI backend implementations. Our study confirms that most bugs stem from inadequate checks on tensor properties. In the final chapter, we characterize DL checker bugs and propose TensorGuard, a tool designed to detect and repair such bugs. TensorGuard achieves an accuracy of 11.1%, surpassing the state-of-the-art bug repair baseline by 2%. We also test TensorGuard on six months of checker-related updates (493 changes) in Google’s JAX library, successfully detecting 64 checker bugs. Taken together, the findings from these five studies provide robust evidence that mining publicly available historical repositories of AI frameworks, such as code repositories and bug databases, with data-driven software analytics holds immense potential for advancing the reliability of AI infrastructure software.
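
The recurring root cause named in the abstract, inadequate validation of tensor properties, is easiest to see in code. The following is a minimal, hypothetical Python sketch of a checker bug and its repaired form; the helper names and the allocation scenario are assumptions for illustration and are not code from the thesis or from any framework it studies.

    import numpy as np

    def alloc_output(shape):
        # Buggy pattern: caller-supplied dimensions are used with no checks.
        # In the C/C++ backend code the thesis studies, this running product
        # is a fixed-width integer, so a crafted shape can overflow it
        # (CWE-190) or drive an enormous allocation.
        n = 1
        for d in shape:
            n *= d
        return np.empty(n, dtype=np.float32)

    def alloc_output_checked(shape):
        # Repaired pattern: validate tensor properties before using them,
        # the kind of check whose absence defines a "checker bug".
        n = 1
        for d in shape:
            if d < 0:
                raise ValueError(f"negative dimension {d} in shape {shape}")
            n *= d
            if n > 2**31 - 1:
                raise OverflowError(f"element count for {shape} exceeds int32")
        return np.empty(n, dtype=np.float32)

For example, alloc_output_checked((-1, 8)) fails fast with a ValueError, whereas the unchecked version forwards a negative element count to the allocator.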
dc.identifier.uri: https://hdl.handle.net/10315/42887
dc.language: en
dc.rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject.keywords: Bug
dc.subject.keywords: Vulnerability
dc.subject.keywords: Deep learning
dc.subject.keywords: Static code analysis
dc.subject.keywords: Fuzz testing
dc.subject.keywords: Code generation
dc.title: Improving the Reliability of AI Infrastructure Software with Data-Driven Software Analytics
dc.type: Electronic Thesis or Dissertation

Files

Original bundle
- Shiri_Harzevili_Nima_2025_PhD.pdf (5.62 MB, Adobe Portable Document Format)

License bundle
- license.txt (1.87 KB, Plain Text)
- YorkU_ETDlicense.txt (3.39 KB, Plain Text)