Built on open lakehouse architecture, Databricks Machine Learning empowers ML teams to prepare and process data, streamlines cross-team collaboration and standardizes the full lifecycle from experimentation to production.
Simplify all aspects of data for ML
Because Databricks ML is built on an open lakehouse foundation with Delta Lake, you can empower your machine learning teams to access, explore and prepare any type of data at any scale. Turn features into production pipelines in a self-service manner without depending on data engineering support.
Automate experiment tracking and governance
Managed MLflow automatically tracks your experiments, logging parameters, metrics, data and code versions, and model artifacts with each training run. You can quickly review previous runs, compare results and reproduce a past result as needed. Once you have identified the best version of a model for production, register it to the Model Registry to simplify handoffs along the deployment lifecycle.
Manage the full model lifecycle with the Model Registry
Once trained models are registered, you can collaboratively manage them through their lifecycle with the Model Registry. Models can be versioned and moved through stages such as staging, production and archived. Lifecycle management integrates with approval and governance workflows enforced through role-based access controls. Comments and email notifications provide a rich collaborative environment for data teams.
Deploy ML models at scale and low latency
From the Model Registry, quickly deploy production models using batch scoring for scale, or Databricks Serving for low-latency online serving as REST API endpoints. Because the Model Registry relies on the MLflow Model format, it benefits from ecosystem integrations for a wide variety of deployments, like deploying Docker containers on Kubernetes or loading a model onto a device.
Databricks notebooks natively support Python, R, SQL and Scala so practitioners can work together with the languages and libraries of their choice to discover, visualize and share insights.
Machine Learning Runtime
One-click access to preconfigured ML-optimized clusters, powered by a scalable and reliable distribution of the most popular ML frameworks (such as PyTorch, TensorFlow and scikit-learn), with built-in optimizations for unmatched performance at scale.
Feature Store
Facilitate the reuse of features with a data lineage–based feature search that leverages automatically logged data sources. Make features available for training and serving with simplified model deployment that doesn't require changes to the client application.
AutoML
Empower everyone from ML experts to citizen data scientists with a "glass box" approach to AutoML that delivers not only the highest-performing model, but also generates code for further refinement by experts.
Managed MLflow
Built on top of MLflow — the world's leading open source platform for the ML lifecycle — Managed MLflow helps ML models quickly move from experimentation to production, with enterprise security, reliability and scale.
Model Serving
One-click deployment of any ML model as a REST endpoint for low-latency serving. Integrates with the Model Registry to manage staging and production versions of endpoints.
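Scoring a served endpoint is a plain HTTPS request. A sketch of building the request body in MLflow's `dataframe_split` scoring format; the workspace URL, endpoint name and token are placeholders:

```python
# Sketch of a REST scoring request body. Column names and row values
# are illustrative; the endpoint URL and token are placeholders.
import json

def build_payload(columns, rows):
    """Package a batch of records in MLflow's dataframe_split format."""
    return json.dumps({"dataframe_split": {"columns": columns, "data": rows}})

payload = build_payload(["age", "tenure"], [[42, 3], [35, 7]])
print(payload)

# To score against a live endpoint (illustrative, not runnable as-is):
# import requests
# resp = requests.post(
#     "https://<workspace-url>/serving-endpoints/<endpoint-name>/invocations",
#     headers={"Authorization": "Bearer <token>",
#              "Content-Type": "application/json"},
#     data=payload,
# )
```

The same payload format works for both a staging and a production version of an endpoint, so client code does not change as versions are promoted.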