Last year, Databricks launched MLflow, an open source framework to manage the machine learning lifecycle that works with any ML library to simplify ML engineering. MLflow provides tools for experiment tracking, reproducible runs and model management that make machine learning applications easier to develop and deploy. In the past year, the MLflow community has grown quickly: 80 contributors from over 40 companies have contributed code to the project, and over 200 companies are using MLflow. In this talk, we’ll present our development plans for MLflow 1.0, the next release of MLflow, which will stabilize the MLflow APIs and introduce multiple new features to simplify the ML lifecycle. We’ll also discuss additional MLflow components that Databricks and other companies are working on for the rest of 2019, such as improved tools for model management, multi-step pipelines and online monitoring.
Matei Zaharia is an Assistant Professor of Computer Science at Stanford University and Chief Technologist at Databricks. He started the Apache Spark project during his PhD at UC Berkeley in 2009, and has worked broadly in datacenter systems, co-starting the Apache Mesos project and contributing as a committer on Apache Hadoop. Today, Matei tech-leads the MLflow development effort at Databricks. Matei’s research work was recognized through the 2014 ACM Doctoral Dissertation Award for the best PhD dissertation in computer science, an NSF CAREER Award and several best paper awards.
Aaron Davidson is an Apache Spark committer and software engineer at Databricks. His Spark contributions include standalone master fault tolerance, shuffle file consolidation, Netty-based block transfer service, and the external shuffle service. At Databricks, he leads the Performance and Storage team, working on the Databricks File System (DBFS) and automating the cloud infrastructure.
Greg Buehrer is a Distinguished Engineer at Microsoft, and is currently the Chief Architect of Azure Machine Learning. He has worked at Microsoft for 12 years in various roles, primarily using machine learning to solve business challenges in Adcenter and Bing Search. These included ad relevance, ad selection, core ranking, fraud detection, query understanding, spell correction, and other similar workloads. Greg has a PhD from the Ohio State University in performance data mining, with research contributions at conferences such as KDD, WSDM, ICDE, ICS, PPoPP, ICDM, and VLDB. Prior to attending graduate school, Greg obtained an undergraduate degree in Chemical Engineering. He spent most of the nine years before returning to university working abroad as a field engineer for Surface Combustion, an industrial steel OEM.