Towards Modular Machine Learning Pipelines

Abstract

Pipelines of Machine Learning (ML) components are a popular and effective approach to divide and conquer many business-critical problems. A pipeline architecture implies a specific division of the overall problem; however, current ML training approaches do not enforce this implied division. Consequently, ML components can become coupled to one another after they are trained, which causes insidious effects. For instance, even when one coupled ML component in a pipeline is improved in isolation, the end-to-end pipeline performance can degrade. In this paper, we develop a conceptual framework to study ML coupling in pipelines and design new modularity regularizers that can eliminate coupling during ML training. We show that the resulting ML pipelines become modular (i.e., their components can be trained independently of one another) and discuss the tradeoffs of our approach versus existing approaches to pipeline optimization.
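The coupling effect described in the abstract can be illustrated with a toy two-stage pipeline. The sketch below is not the paper's framework or its modularity regularizers; it is a hypothetical example in which every name (`upstream_v1`, `upstream_v2`, `fit_line`, `downstream`) and the synthetic data are assumptions made for illustration. A downstream linear model is fit on the outputs of a biased upstream component and learns to compensate for that bias. Replacing the upstream component with a more accurate one then lowers the upstream error but raises the end-to-end error.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Synthetic task (assumed): intermediate target z = x, final target y = 2 * z.
xs = [float(i) for i in range(10)]
z_true = list(xs)
y_true = [2.0 * z for z in z_true]


def upstream_v1(x):
    """Hypothetical first upstream component: systematically biased by +1."""
    return x + 1.0


def upstream_v2(x):
    """Hypothetical improved upstream component: exact on this toy task."""
    return x


# Train the downstream component on the outputs of upstream_v1.
# It absorbs the upstream bias, which couples the two components.
z_v1 = [upstream_v1(x) for x in xs]
a, b = fit_line(z_v1, y_true)


def downstream(z):
    """Downstream component fit against upstream_v1 outputs."""
    return a * z + b


# Swap in the improved upstream component without retraining downstream.
z_v2 = [upstream_v2(x) for x in xs]

upstream_mse_v1 = mse(z_v1, z_true)
upstream_mse_v2 = mse(z_v2, z_true)
pipeline_mse_v1 = mse([downstream(z) for z in z_v1], y_true)
pipeline_mse_v2 = mse([downstream(z) for z in z_v2], y_true)

print("upstream MSE:", upstream_mse_v1, "->", upstream_mse_v2)
print("pipeline MSE:", pipeline_mse_v1, "->", pipeline_mse_v2)
```

In this toy setting the upstream error falls from 1.0 to 0.0 while the end-to-end error rises from 0.0 to 4.0, because the downstream model still subtracts a bias that is no longer present.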

Cite

Text

Modi et al. "Towards Modular Machine Learning Pipelines." ICML 2023 Workshops: LLW, 2023.

Markdown

[Modi et al. "Towards Modular Machine Learning Pipelines." ICML 2023 Workshops: LLW, 2023.](https://mlanthology.org/icmlw/2023/modi2023icmlw-modular/)

BibTeX

@inproceedings{modi2023icmlw-modular,
  title     = {{Towards Modular Machine Learning Pipelines}},
  author    = {Modi, Aditya and Kaur, Jivat Neet and Makar, Maggie and Mallapragada, Pavan and Sharma, Amit and Kiciman, Emre and Swaminathan, Adith},
  booktitle = {ICML 2023 Workshops: LLW},
  year      = {2023},
  url       = {https://mlanthology.org/icmlw/2023/modi2023icmlw-modular/}
}