OptiServe: Cost-Aware, Performance-Driven, and Accuracy-Tuned Serverless Applications with ML Workloads

dc.contributor.advisor: Khazaei, Hamzeh
dc.contributor.author: Boukani, Arian
dc.date.accessioned: 2026-03-10T16:20:39Z
dc.date.available: 2026-03-10T16:20:39Z
dc.date.copyright: 2026-01-29
dc.date.issued: 2026-03-10
dc.date.updated: 2026-03-10T16:20:39Z
dc.degree.discipline: Computer Science
dc.degree.level: Master's
dc.degree.name: MSc - Master of Science
dc.description.abstract: Serverless computing has emerged as a popular cloud paradigm due to its seamless scalability and cost-efficient, pay-as-you-go pricing model. Its potential to support machine learning (ML) inference workloads, including generative AI tasks, has led to growing adoption of ML functions within serverless applications. A key challenge, however, is selecting suitable ML models that balance execution time, deployment cost, and inference accuracy in latency- and cost-sensitive environments. In this study, we present a framework for optimizing serverless applications that incorporate ML components through tri-objective optimization. We develop high-fidelity analytical models, augmented with lightweight profiling, to capture the trade-offs among cost, performance, and accuracy across different model choices. These models serve as the foundation for guiding ML model selection and deployment strategies to meet application-specific service-level objectives. We validate our framework through real-world experiments on AWS using real serverless applications. Furthermore, we demonstrate its practicality by performing extensive what-if analyses, exploring a wide range of application scenarios and configurations, in under a minute. Our extensive experiments on real-world applications show that OptiServe recommends memory and ML model configurations that achieve over 95% of the accuracy of ideal configurations in 89.64% of cases, enabling efficient, low-cost deployments while maintaining model accuracy and meeting performance targets.
dc.identifier.uri: https://hdl.handle.net/10315/43652
dc.language: en
dc.rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject: Computer science
dc.subject.keywords: Serverless computing
dc.subject.keywords: ML inference
dc.subject.keywords: Function and application modelling
dc.subject.keywords: Multi-objective optimization
dc.title: OptiServe: Cost-Aware, Performance-Driven, and Accuracy-Tuned Serverless Applications with ML Workloads
dc.type: Electronic Thesis or Dissertation

Files

Original bundle

Name: Boukani_Arian_2026_MSc.pdf
Size: 1.27 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.87 KB
Format: Plain Text

Name: YorkU_ETDlicense.txt
Size: 3.39 KB
Format: Plain Text
