Machine Learning in Production

Summary

In this 1-day course, machine learning engineers, data engineers, and data scientists learn best practices for managing the complete machine learning lifecycle, from experimentation and model management through the various deployment modalities to production issues. Students begin with end-to-end reproducibility of machine learning models using MLflow, including data management, experiment tracking, and model management. They then deploy models in batch, streaming, and real-time settings and address the related monitoring, alerting, and CI/CD issues. Sample code accompanies all modules and theoretical concepts.

Description

First, this course explores managing the experimentation process using MLflow, with a focus on end-to-end reproducibility including data, model, and experiment tracking. Second, students operationalize their models by integrating with various downstream deployment tools, including saving models to the MLflow model registry, managing artifacts and environments, and automating the testing of their models. Third, students implement batch, streaming, and real-time deployment options. Finally, the course covers additional production issues, including continuous integration and continuous deployment, as well as monitoring and alerting.

By the end of this course, you will have built an end-to-end pipeline to log, deploy, and monitor machine learning models. This course is taught entirely in Python.
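
To make the three deployment options named above concrete, here is a minimal sketch in plain Python. It is not taken from the courseware: `toy_model`, `predict_batch`, `predict_stream`, and `predict_one` are hypothetical names chosen for illustration, and the "model" is a stand-in that averages its inputs. The point is only that one scoring function can be reused across batch, streaming, and real-time settings.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical type alias: a "model" is any callable that maps a feature row to a score.
Model = Callable[[List[float]], float]


def toy_model(features: List[float]) -> float:
    """Stand-in for a trained model (assumption): returns the mean of the features."""
    return sum(features) / len(features)


def predict_batch(model: Model, rows: List[List[float]]) -> List[float]:
    """Batch: score a complete, already-collected dataset in one pass."""
    return [model(row) for row in rows]


def predict_stream(model: Model, rows: Iterable[List[float]]) -> Iterator[float]:
    """Streaming: score rows lazily, one at a time, as they arrive."""
    for row in rows:
        yield model(row)


def predict_one(model: Model, row: List[float]) -> float:
    """Real-time: score a single request, as a REST handler might."""
    return model(row)
```

In a real deployment the batch path would typically read from and write to storage, the streaming path would consume a message stream, and the real-time path would sit behind a web service; the sketch leaves all of that out.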

Duration

8 hours

Objectives

Upon completion, students should be able to:

  • Track data and machine learning experiments to organize the machine learning life cycle
  • Create, organize, and package machine learning projects with a focus on reproducibility and using a model registry to collaborate with a team
  • Develop a generalizable way of handling machine learning models created in and deployed to a variety of environments
  • Deploy basic CI/CD infrastructure using webhooks
  • Explore the various production issues encountered in deploying and monitoring machine learning models
  • Apply various strategies for deploying models in batch, streaming, and real-time settings
  • Explore various statistically rigorous solutions to drift and implement basic retraining methods
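
As one illustration of what a statistical drift check can look like, the sketch below compares a reference sample of a feature with a more recent sample using the two-sample Kolmogorov-Smirnov statistic. This is an assumed example, not necessarily the method taught in the course. The function names `ks_statistic` and `drift_detected` are hypothetical, and the threshold is the standard large-sample approximation.

```python
import math
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    n, m = len(ref), len(cur)
    largest_gap = 0.0
    # The gap between two step functions can only change at observed values,
    # so it is enough to evaluate both empirical CDFs at every data point.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / n
        cdf_cur = bisect_right(cur, x) / m
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_cur))
    return largest_gap


def drift_detected(reference: Sequence[float], current: Sequence[float],
                   alpha: float = 0.05) -> bool:
    """Flag drift when the KS statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    # Approximate critical value: c(alpha) * sqrt((n + m) / (n * m)),
    # with c(alpha) = sqrt(-ln(alpha / 2) / 2).
    c_alpha = math.sqrt(-math.log(alpha / 2) / 2)
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical
```

A retraining job could be triggered whenever `drift_detected` returns `True` for a monitored feature. For small samples or many features, an exact test and a multiple-comparison correction would be more appropriate than this approximation.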

Audience

  • Data Scientist
  • Machine Learning Engineer
  • Data Engineer

Prerequisites

  • Intermediate experience using Python/pandas
  • Working knowledge of machine learning and data science (scikit-learn, TensorFlow, etc.)
  • Familiarity with Apache Spark
  • Basic familiarity with object storage, databases, and networking

Outline

Day #1 AM

Time    Lesson                                  Description
40 min  Introductions, Setup & MLflow Overview  Registration, Courseware & Q&As
20 min  Data Management                         Design patterns to manage data lineage for data reproducibility
10 min  Break
30 min  Experiment Tracking                     Tracking experiments to organize the machine learning life cycle
30 min  Advanced Experiment Tracking            Advanced methods for tracking ML experiments
20 min  Model Management                        Creating, organizing, and packaging machine learning projects with pre-processing code
20 min  Model Registry                          Artifact management for production models
20 min  Webhooks and Testing                    Integrating MLflow webhooks into the Model Registry

Day #1 PM

Time    Lesson                                  Description
30 min  Production Issues                       Various production issues encountered in deploying and monitoring machine learning models
30 min  Batch Deployment                        Various strategies for deploying models using batch including pure Python, Spark, and on the JVM
10 min  Break
20 min  Streaming Deployment                    How to perform inference on a stream of incoming data
40 min  Real Time Deployment                    Real time deployment with a focus on RESTful services
10 min  Break
30 min  CI/CD                                   Continuous Integration, Continuous Deployment of ML models
20 min  Drift Monitoring                        Explore solutions to concept and data drift
10 min  Alerting                                Alerting strategies using email and REST integration

Upcoming Classes

Date         Time                    Location                                         Price
Sep 20       9:00 AM – 5:00 PM CEST  Online – Virtual – EMEA                          $1000.00 USD
Oct 6 – 7    9:00 AM – 1:00 PM PDT   Online – Virtual – Americas (half-day schedule)  $1000.00 USD
Oct 20       9:00 AM – 5:00 PM CEST  Online – Virtual – EMEA                          $1000.00 USD
Nov 17 – 18  9:00 AM – 1:00 PM PST   Online – Virtual – Americas (half-day schedule)  $1000.00 USD
Nov 24       9:00 AM – 5:00 PM CET   Online – Virtual – EMEA                          $1000.00 USD
Dec 17       9:00 AM – 5:00 PM CET   Online – Virtual – EMEA                          $1000.00 USD
Dec 28 – 29  9:00 AM – 1:00 PM PST   Online – Virtual – Americas (half-day schedule)  $1000.00 USD