Machine learning suffers from a reproducibility crisis. Deterministic machine learning matters to academia for verifying published results, and to developers for debugging, auditing and regression-testing models.
Because non-determinism in ML has many sources, especially when GPUs are involved, I conducted several experiments to identify all causes and the corresponding solutions (where available).
Based on these solutions I developed mlf-core (https://mlf-core.com), which provides CPU- and GPU-deterministic project templates based on MLflow for PyTorch, TensorFlow and XGBoost. A custom linter checks that models remain deterministic at all times.
Speaker: Lukas Heumos
University of Tübingen / Quantitative Biology Center Tübingen
Lukas Heumos is a research software engineer at the Quantitative Biology Center, Tübingen, with degrees in Bioinformatics. As part of his scientific work he conducts research on reproducible bioinformatics workflows. Building on this experience, he now leads the effort to enable deterministic and even replicable machine learning with mlf-core.
A passionate open-source contributor and hackathon enthusiast, he received the 2019 University of Tübingen award for exceptional student commitment and was accepted as a Lindau Nobel Laureate Young Scientist for the 70th Lindau Nobel Laureate Meeting.