Databricks Runtime for Machine Learning - Databricks

Databricks Runtime for Machine Learning

Ready-to-use machine learning environment

The Databricks Runtime for Machine Learning (ML) provides data scientists and ML practitioners with scalable clusters that include popular frameworks like TensorFlow, Keras, PyTorch, scikit-learn and more to train cutting-edge machine learning and deep learning models.

How it works

Databricks Runtime for ML is built on top of Databricks Runtime and updated with every Databricks Runtime release. It is now generally available across all Databricks product offerings, including Azure Databricks and Databricks on AWS, on both GPU and CPU clusters.

To use the Databricks Runtime for ML, select the ML version of the runtime when you create your cluster.


Ease of Use: Databricks Runtime for ML provides one-click access to pre-configured ML clusters, including the most popular ML libraries and frameworks like TensorFlow, Keras, PyTorch, Horovod, scikit-learn, XGBoost and their dependencies.
Reliability: ML frameworks are evolving at a frenetic pace. To ensure the most stable environment, integration tests are run daily against the Runtime for each supported framework, and stress tests are run before new libraries are integrated.
Performance: Databricks Runtime for ML includes unique performance improvements for GraphFrames, MLlib Logistic Regression and Decision Trees, as well as HorovodRunner, a simple API for distributed deep learning training.
Unification: Databricks Runtime for ML runs on the Databricks Unified Analytics Platform, allowing data teams to run all analytics processes in the same environment, from ETL to model building and deployment, securely and at scale, leveraging Databricks Workspaces, Delta, and MLflow.


Pre-packaged ML frameworks

Conda Managed Runtime
Benefit from Conda integration for Python package management. All Python packages are installed in a single environment.

ML Frameworks Integration
The most popular ML libraries and frameworks are provided out of the box, including TensorFlow / TensorBoard, Keras, PyTorch, MLflow, Horovod / HorovodRunner, GraphFrames, scikit-learn, XGBoost, NumPy, MLeap, and pandas.
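
One quick way to confirm which of these libraries a given cluster actually provides is to query the installed package metadata. The sketch below uses only the Python standard library. The distribution names in `FRAMEWORKS` are assumptions (for example, PyTorch is published as `torch`) and may need adjusting for your runtime version.

```python
from importlib import metadata

# Assumed distribution names for some of the pre-packaged libraries.
FRAMEWORKS = [
    "tensorflow", "keras", "torch", "mlflow", "horovod",
    "scikit-learn", "xgboost", "numpy", "pandas",
]


def installed_versions(names):
    """Return a dict mapping each distribution name to its version, or None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(FRAMEWORKS).items():
        print(f"{name:15s} {version or 'not installed'}")
```

Run in a notebook cell or as a script, this prints one line per library with its version, or "not installed" where the name is not found.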

Optimized performance for scale

Optimized TensorFlow
Benefit from a CUDA-optimized version of TensorFlow on GPU clusters and an Intel MKL-DNN-optimized TensorFlow package on Intel CPUs for maximum performance.

Quickly migrate your single-node deep learning training code to run on a Databricks cluster with HorovodRunner, a simple API that abstracts away the complications of using Horovod for distributed training.
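
As a rough illustration of this migration path, the sketch below wraps a training function and hands it to HorovodRunner. It assumes the `sparkdl` package that ships with Databricks Runtime for ML and Horovod's PyTorch bindings. `train_fn` and `run_with_horovod_runner` are hypothetical names, and the training body is a placeholder that only reports each worker's rank. Where `sparkdl` is not importable (for example, outside an ML runtime cluster), the helper returns `None` instead of launching anything.

```python
def train_fn(learning_rate=0.1):
    """Placeholder training function, executed once per Horovod process."""
    # Imported inside the function so the import resolves on each worker.
    import horovod.torch as hvd

    hvd.init()  # set up communication between the worker processes
    # Real single-node training code would go here, typically with the
    # optimizer wrapped in hvd.DistributedOptimizer.
    print(f"worker {hvd.rank()} of {hvd.size()}, lr={learning_rate}")


def run_with_horovod_runner(fn, num_processes, **kwargs):
    """Launch fn on num_processes workers; return None if sparkdl is missing."""
    try:
        # Assumption: sparkdl is available, as on Databricks Runtime for ML.
        from sparkdl import HorovodRunner
    except ImportError:
        return None
    runner = HorovodRunner(np=num_processes)
    return runner.run(fn, **kwargs)
```

On an ML runtime cluster, a call such as `run_with_horovod_runner(train_fn, 2, learning_rate=0.05)` would start two training processes.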

Optimized MLlib Logistic Regression and Tree Classifiers
The most popular estimators have been optimized as part of the Databricks Runtime for ML to provide you with up to 40% speed-up compared to Apache Spark 2.4.0.

Optimized GraphFrames
Run GraphFrames 2-4 times faster and benefit from up to a 100-fold speed-up for graph queries, depending on the workload and data skew.

Optimized Storage for Deep Learning Workloads
Leverage high-performance solutions on Azure & AWS for data loading and model checkpointing, both of which are critical to deep learning training workloads.

Unified with Databricks Unified Analytics Platform

Databricks Delta
Databricks Delta, the next-generation analytics engine, is included in the runtime and allows teams to build robust and performant ML pipelines, including ETL and data prep at scale.

Managed MLflow
Included in the runtime, MLflow allows for end-to-end management of the ML lifecycle, from experiment tracking to project reproducibility and model deployment.
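
To give a sense of what experiment tracking looks like in practice, here is a minimal sketch using MLflow's tracking API. The helper name `log_training_run` is hypothetical, and the sketch assumes the default tracking location is acceptable. If `mlflow` is not importable, it returns `None` rather than failing.

```python
def log_training_run(params, metrics):
    """Log params and metrics to MLflow; return the run ID, or None without MLflow."""
    try:
        import mlflow
    except ImportError:
        return None
    with mlflow.start_run() as run:
        mlflow.log_params(params)    # e.g. hyperparameters used for training
        mlflow.log_metrics(metrics)  # e.g. final evaluation scores
        return run.info.run_id
```

A call might pass a dictionary of hyperparameters and a dictionary of numeric scores from your own training code. The returned run ID can then be used to look the run up in the MLflow UI.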

GPU Instances
Cross-cloud support for both Amazon P2/P3 instances and the Azure NC and NCv3 series.
