On July 19, we held our monthly Bay Area Spark Meetup (BASM) at Databricks HQ in San Francisco. At the Spark + AI Summit in June, we announced two open-source projects: Project Hydrogen and MLflow.
Partly to continue sharing the progress of these open-source projects with the community and partly to encourage community contributions, two leading engineers from the respective teams spoke at this meetup, covering technical details, roadmaps, and ways the community can get involved.
First, Xiangrui Meng presented Project Hydrogen: State-of-the-Art Deep Learning on Apache Spark. He elaborated on the shortcomings of using Apache Spark with deep learning frameworks and explained how this endeavor resolves them and integrates these frameworks as first-class citizens, taking advantage of Spark’s distributed computing nature and fault-tolerance capabilities at scale.
Second, Aaron Davidson shared MLflow: Infrastructure for a Complete Machine Learning Life Cycle. He spoke about challenges in the ML life cycle today and detailed how experimenting, tracking, deploying, and serving machine learning models can be achieved using MLflow’s modular components and APIs for an end-to-end machine learning model life cycle.
View Slides to Project Hydrogen Presentation
View Slides to MLflow Presentation
You can peruse the slides and watch the video at your leisure. To those who helped and attended, thank you for participating and for your continued community support.