An Exploratory Study on the Platforms of Sharing Reusable Machine Learning Models

Date

2021-03-08

Authors

Xiu, Minke

Abstract

Recent advances in Artificial Intelligence, especially in Machine Learning (ML), have brought applications previously considered science fiction (e.g., virtual personal assistants and autonomous cars) within the reach of millions of everyday users. Since modern ML technologies like deep learning require considerable technical expertise and resources to build custom models, reusing existing models trained by experts has become essential. Currently, ML models are shared, distributed, or retailed on multiple ML model platforms, which can be divided into two categories based on their usage patterns: (1) ML model stores, whose models can be deployed and served with the help of cloud infrastructure, and (2) ML package repositories, whose models are free but need to be deployed and used manually (e.g., embedded into users' applications as a software component).

We conducted an exploratory study on these two categories of ML model platforms: ML model stores and ML package repositories. We analyzed the structure and contents of the ML model platforms, as well as the functionalities provided by their package managers. The research subjects were three general-purpose ML model stores (AWS Marketplace, ModelDepot, and Wolfram Neural Net Repository) and two popular ML package repositories (TensorFlow Hub and PyTorch Hub). When studying the structure of ML model platforms and the functionalities of package managers, we compared them against their counterparts from traditional software development: ML model stores vs. mobile app stores (e.g., Google Play and Apple App Store), and ML package repositories vs. programming language package repositories (e.g., npm, PyPI, and CRAN). Through our study, we identified software engineering practices and challenges specific to sharing, distributing, and retailing ML models. The implications of this thesis can help stakeholders make ML model platforms better serve their users (i.e., software engineers, data scientists, and researchers).

Keywords

Computer Engineering
