TY - CHAP
T1 - Making a Pipeline Production-Ready
T2 - Challenges and Lessons Learned in the Healthcare Domain
AU - Lawand, Daniel Angelo Esteves
AU - Lam, Lucas Quaresma Medina
AU - Bolgheroni, Roberto Oliveira
AU - Ferreira, Renato Cordeiro
AU - Goldman, Alfredo
AU - Finger, Marcelo
PY - 2025/5/9
Y1 - 2025/5/9
N2 - Deploying a Machine Learning (ML) training pipeline into production requires good software engineering practices. Unfortunately, the typical data science workflow often leads to code that lacks critical software quality attributes. This experience report investigates this problem in SPIRA, a project whose goal is to create an ML-Enabled System (MLES) to pre-diagnose respiratory insufficiency via speech analysis. This paper presents an overview of the architecture of the MLES, then compares three versions of its Continuous Training subsystem: from a proof-of-concept Big Ball of Mud (v1), to a design pattern-based Modular Monolith (v2), to a test-driven set of Microservices (v3). Each version improved its overall extensibility, maintainability, robustness, and resiliency. The paper shares challenges and lessons learned in this process, offering insights for researchers and practitioners seeking to productionize their pipelines.
AB - Deploying a Machine Learning (ML) training pipeline into production requires good software engineering practices. Unfortunately, the typical data science workflow often leads to code that lacks critical software quality attributes. This experience report investigates this problem in SPIRA, a project whose goal is to create an ML-Enabled System (MLES) to pre-diagnose respiratory insufficiency via speech analysis. This paper presents an overview of the architecture of the MLES, then compares three versions of its Continuous Training subsystem: from a proof-of-concept Big Ball of Mud (v1), to a design pattern-based Modular Monolith (v2), to a test-driven set of Microservices (v3). Each version improved its overall extensibility, maintainability, robustness, and resiliency. The paper shares challenges and lessons learned in this process, offering insights for researchers and practitioners seeking to productionize their pipelines.
KW - Code Quality
KW - MLOps
KW - Software Architecture
KW - Machine Learning Enabled Systems
KW - Healthcare Domain
KW - Experience Report
UR - https://doi.org/10.1007/978-3-032-04403-7_30
U2 - 10.1007/978-3-032-04403-7_30
DO - 10.1007/978-3-032-04403-7_30
M3 - Chapter
SN - 978-3-032-04402-0
VL - 15982
T3 - Lecture Notes in Computer Science
SP - 354
EP - 362
BT - Software Architecture
ER -