Manasi Vartak is the founder and CEO of Verta, an MIT spinoff building an open-core MLOps platform for the full ML lifecycle. Verta grew out of Manasi’s Ph.D. work at MIT on ModelDB, the first open-source model management system deployed at Fortune 500 companies. The Verta MLOps platform enables data scientists and ML engineers to robustly take trained ML models through the end-to-end MLOps cycle including versioning, packaging, release, operations, and monitoring. Previously, Manasi worked on feed ranking at Twitter and dynamic ad-targeting at Google. Manasi has spoken at several top research as well as industrial conferences such as the Strata O’Reilly Conference, SIGMOD, VLDB, Spark Summit, and AnacondaCON, and has authored a course on model management.
May 26, 2021 04:25 PM PT
For any organization whose core product or business depends on ML models (think Slack search, Twitter feed ranking, or Tesla Autopilot), ensuring that production ML models are performing with high efficacy is crucial. In fact, according to the McKinsey report on model risk, defective models have led to revenue losses of hundreds of millions of dollars in the financial sector alone. However, in spite of the significant harms of defective models, tools to detect and remedy model performance issues for production ML models are missing.
Based on our experience building ML debugging and robustness tools at MIT CSAIL and managing large-scale model inference services at Twitter, Nvidia, and now at Verta, we developed a generalized model monitoring framework that can monitor a wide variety of ML models, work unchanged in batch and real-time inference scenarios, and scale to millions of inference requests. In this talk, we focus on how this framework applies to monitoring ML inference workflows built on top of Apache Spark and Databricks. We describe how we can supplement the massively scalable data processing capabilities of these platforms with statistical processors to support the monitoring and debugging of ML models.
Learn how ML Monitoring is fundamentally different from application performance monitoring or data monitoring. Understand what model monitoring must achieve for batch and real-time model serving use cases. Then dig in with us as we focus on the batch prediction use case for model scoring and demonstrate how we can leverage the core Apache Spark engine to easily monitor model performance and identify errors in serving pipelines.
February 7, 2017 04:00 PM PT
Building a machine learning model is an iterative process. A data scientist will build many tens to hundreds of models before arriving at one that meets some acceptance criteria. However, the current style of model building is ad-hoc and there is no practical way for a data scientist to manage models that are built over time. In addition, there are no means to run complex queries on models and related data.
In this talk, we present ModelDB, a novel end-to-end system for managing machine learning (ML) models. Using client libraries, ModelDB automatically tracks and versions ML models in their native environments (e.g. spark.ml, scikit-learn). A common set of abstractions enable ModelDB to capture models and pipelines built across different languages and environments. The structured representation of models and metadata then provides a platform for users to issue complex queries across various modeling artifacts. Our rich web frontend provides a way to query ModelDB at varying levels of granularity.
ModelDB has been open-sourced at https://github.com/mitdbg/modeldb.