Managed MLflow

Managing the complete machine learning lifecycle

What is Managed MLflow?

Managed MLflow extends MLflow, an open source platform developed by Databricks for machine learning lifecycle management, with a focus on enterprise reliability, security and scalability. The latest update to MLflow introduces LLMOps features that improve its ability to manage and deploy large language models (LLMs). This expanded LLM support comes from new integrations with the industry-standard LLM tools Hugging Face Transformers and OpenAI functions, as well as the MLflow AI Gateway. In addition, MLflow’s integration with LangChain and the Prompt Engineering UI simplifies model development for generative AI applications across a variety of use cases, including chatbots, document summarization, text classification and sentiment analysis.

Benefits

Model development

Enhance and expedite machine learning lifecycle management with a standardized framework for production-ready models. Managed MLflow Recipes enable seamless ML project bootstrapping, rapid iteration and large-scale model deployment. Develop generative AI applications such as chatbots, document summarization, sentiment analysis and classification with MLflow’s AI Gateway and Prompt Engineering, which integrate with LangChain, Hugging Face and OpenAI.

Experiment tracking

Run experiments with any ML library, framework or language, and automatically keep track of parameters, metrics, code and models from each experiment. By using MLflow on Databricks, you can securely share, manage and compare experiment results along with corresponding artifacts and code versions — thanks to built-in integrations with the Databricks Workspace and notebooks.

Model management

Use one central place to discover and share ML models, collaborate on moving them from experimentation to online testing and production, integrate with approval and governance workflows and CI/CD pipelines, and monitor ML deployments and their performance. The MLflow Model Registry facilitates sharing of expertise and knowledge, and helps you stay in control.

Model deployment

Quickly deploy production models for batch inference on Apache Spark™ or as REST APIs using built-in integration with Docker containers, Azure ML or Amazon SageMaker. With Managed MLflow on Databricks, you can operationalize and monitor production models using the Databricks Jobs Scheduler and auto-managed clusters that scale with business needs.

The latest upgrades to MLflow seamlessly package gen AI applications for deployment. You can now deploy your chatbots, and other gen AI applications such as document summarization, sentiment analysis and classification, at scale using Databricks Model Serving.

Features

MLflow Tracking

MLFLOW TRACKING: Automatically log parameters, code versions, metrics, and artifacts for each run using the Python, REST, R, and Java APIs.

PROMPT ENGINEERING: Build gen AI applications for use cases such as chatbots, document summarization, sentiment analysis and classification with MLflow’s AI Gateway and Prompt Engineering. Both are supported by native integration with LangChain and a no-code UI for fast prototyping and iteration.

MLFLOW TRACKING SERVER: Get started quickly with a built-in tracking server to log all runs and experiments in one place. No configuration needed on Databricks.

EXPERIMENT MANAGEMENT: Create, secure, organize, search, and visualize experiments from within the Workspace with access control and search queries.

MLFLOW RUN SIDEBAR: Automatically track runs from within notebooks and capture a snapshot of your notebook for each run, so that you can always go back to previous versions of your code.

LOGGING DATA WITH RUNS: Log parameters, data sets, metrics, artifacts and more as runs to local files, to a SQLAlchemy compatible database, or remotely to a tracking server.

DELTA LAKE INTEGRATION: Track large-scale data sets that fed your models with Delta Lake snapshots.

ARTIFACT STORE: Store large files, such as models, in Amazon S3, Azure Blob Storage, Google Cloud Storage, an SFTP server, a shared NFS file system, or local file paths.

MLflow Models

MLFLOW MODELS: A standard format for packaging machine learning models that can be used in a variety of downstream tools — for example, real-time serving through a REST API or batch inference on Apache Spark.

MODEL CUSTOMIZATION: Use Custom Python Models and Custom Flavors for models from an ML library that is not explicitly supported by MLflow’s built-in flavors.

BUILT-IN MODEL FLAVORS: MLflow provides several standard flavors that might be useful in your applications, like Python and R functions, Hugging Face, OpenAI and LangChain, PyTorch, Spark MLlib, TensorFlow, and ONNX.

BUILT-IN DEPLOYMENT TOOLS: Quickly deploy on Databricks via an Apache Spark UDF, on a local machine, or in several other production environments such as Microsoft Azure ML and Amazon SageMaker, or build Docker images for deployment.

MLflow Model Registry

CENTRAL REPOSITORY: Register MLflow models with the MLflow Model Registry. A registered model has a unique name, version, stage, and other metadata.

MODEL VERSIONING: Automatically keep track of versions for registered models when updated.

MODEL STAGE: Assign preset or custom stages, like “Staging” and “Production”, to each model version to represent the lifecycle of a model.

CI/CD WORKFLOW INTEGRATION: Record stage transitions, request, review and approve changes as part of CI/CD pipelines for better control and governance.

MODEL STAGE TRANSITIONS: Record new registration events or changes as activities that automatically log users, changes, and additional metadata such as comments.

MLflow AI Gateway

GOVERN ACCESS TO LLMS: Manage SaaS LLM credentials

CONTROL COSTS: Set up rate limits

STANDARDIZE LLM INTERACTIONS: Experiment with different OSS/SaaS LLMs with standard input/output interfaces for different tasks: completions, chat, embeddings

MLflow Recipes

SIMPLIFIED PROJECT STARTUP: MLflow Recipes provides out-of-box connected components for building and deploying ML models.

ACCELERATED MODEL ITERATION: MLflow Recipes creates standardized, reusable steps for model iteration — making the process faster and less expensive.

AUTOMATED TEAM HANDOFFS: Opinionated structure provides modularized production-ready code, enabling automatic handoff from experimentation to production.

MLflow Projects

MLFLOW PROJECTS: MLflow Projects let you specify the software environment used to execute your code. MLflow currently supports the following project environments: Conda environment, Docker container environment, and system environment. Any Git repo or local directory can be treated as an MLflow project.

REMOTE EXECUTION MODE: Run MLflow Projects from Git or local sources remotely on Databricks clusters using the Databricks CLI to quickly scale your code.

See our Product News from Azure Databricks and AWS to learn more about our latest features.

Comparing MLflow offerings

Open Source MLflow

Managed MLflow on Databricks

Experiment Tracking

MLflow tracking API

MLflow tracking server: self-hosted (Open Source MLflow), fully managed (Managed MLflow on Databricks)

Notebooks integration

Workflows integration

Reproducible Projects

MLflow Projects

Git and Conda integration

Scalable cloud/clusters for project runs

Model Management

MLflow Model Registry

Model versioning

ACL-based stage transition

CI/CD workflow integrations

Flexible Deployment

Built-in batch inference

MLflow Models

Built-in streaming analytics

Security and Management

High availability

Automated updates

Role-based access control

How it works

MLflow is a lightweight set of APIs and user interfaces that can be used with any ML framework throughout the machine learning workflow. It includes four components: MLflow Tracking, MLflow Projects, MLflow Models and MLflow Model Registry.

More about MLflow

MLflow Tracking

Record and query experiments: code, data, config, and results.

MLflow Projects

Packaging format for reproducible runs on any platform.

MLflow Models

General format for sending models to diverse deployment tools.

MLflow Model Registry

Centralized repository to collaboratively manage MLflow models throughout the full lifecycle.


Managed MLflow on Databricks

Managed MLflow on Databricks is a fully managed version of MLflow providing practitioners with reproducibility and experiment management across Databricks Notebooks, Jobs, and data stores, with the reliability, security, and scalability of the Databricks Lakehouse Platform.
