AutoML on Databricks

Automating Machine Learning pipelines at scale

AutoML on Databricks automates machine learning pipelines, from feature engineering and model search to hyperparameter tuning and inference, while giving data scientists the flexibility and control they need.

How it works

Databricks automates many steps of the data science workflow: augmented data preparation, visualization, feature engineering, hyperparameter tuning, model search, and automatic model tracking, reproducibility, and deployment. It does so through a combination of native product offerings, partnerships, and custom solutions, giving you a fully controlled and transparent AutoML experience.

For example, you can run hyperparameter tuning at scale on Databricks with enhanced Hyperopt and MLflow integration.


Scalability: Automatically scale your workloads up and down, and speed up training with out-of-the-box optimizations for the most popular ML frameworks.
Control: Choose the algorithms best suited to the task in either a single-node or a multi-node environment, and limit the number of runs to keep costs down.
Ease of use: Automatically log results with MLflow tracking and parallelize hyperparameter search with Hyperopt on Databricks.
Unification: Run all AutoML steps on the same platform, from ETL to model training and inference, securely, collaboratively, and at scale.
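
As a rough illustration of the control and ease-of-use points above, the sketch below runs a hyperparameter search with a fixed run budget, evaluates candidates in parallel, and records every run. It is plain Python using only the standard library, not the Databricks, Hyperopt, or MLflow APIs: random search stands in for Hyperopt, a thread pool stands in for cluster parallelism, and an in-memory list stands in for MLflow tracking. The objective function, parameter names (`lr`, `depth`), and ranges are all made up for the example.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def objective(params):
    """Toy loss with its minimum at lr=0.1, depth=6 (illustrative only)."""
    return (params["lr"] - 0.1) ** 2 + 0.01 * (params["depth"] - 6) ** 2


def sample_params(rng):
    """Draw one candidate configuration from a hypothetical search space."""
    return {"lr": rng.uniform(0.001, 0.5), "depth": rng.randint(2, 12)}


def run_search(max_evals=32, parallelism=4, seed=0):
    """Random search with a hard cap on runs, evaluated in parallel.

    max_evals caps the number of runs (the cost control knob);
    parallelism is the number of candidates evaluated at once.
    """
    rng = random.Random(seed)
    # Sample every candidate up front so results do not depend on thread timing.
    candidates = [sample_params(rng) for _ in range(max_evals)]
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        losses = list(pool.map(objective, candidates))
    # Stand-in for experiment tracking: one record per run.
    runs = [
        {"run_id": i, "params": p, "loss": loss}
        for i, (p, loss) in enumerate(zip(candidates, losses))
    ]
    best = min(runs, key=lambda r: r["loss"])
    return best, runs


if __name__ == "__main__":
    best, runs = run_search(max_evals=32, parallelism=4)
    print(f"{len(runs)} runs logged; best params: {best['params']}")
```

Raising `max_evals` widens the search at a higher compute cost, while `parallelism` trades wall-clock time against the number of workers in use.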


Out of the Box

MLflow Experiments Tracking
Track, compare, and visualize hundreds of thousands of experiments using open source or Managed MLflow.

Automated Hyperparameter Tuning for Distributed Machine Learning
Deep integration with PySpark MLlib’s Cross Validation to automatically track MLlib experiments in MLflow.
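
To make the idea concrete, here is a minimal sketch of what cross-validated tuning with per-run tracking involves: each value in a parameter grid is scored by k-fold cross-validation and recorded as one run. This is standard-library Python rather than PySpark MLlib or MLflow. The one-dimensional ridge model, the `lam` parameter, and the function names are assumptions chosen to keep the example small.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_ridge(xs, ys, lam):
    """Closed-form 1-D ridge regression without an intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def cross_validate(xs, ys, lam, k=3):
    """Mean held-out squared error across k folds for one value of lam."""
    errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], lam)
        mse = sum((ys[i] - w * xs[i]) ** 2 for i in held_out) / len(held_out)
        errors.append(mse)
    return sum(errors) / len(errors)


def grid_search(xs, ys, grid, k=3):
    """Score every grid value and keep one record per run (tracking stand-in)."""
    runs = [{"lam": lam, "cv_mse": cross_validate(xs, ys, lam, k)} for lam in grid]
    return min(runs, key=lambda r: r["cv_mse"]), runs


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    ys = [2.0 * x for x in xs]
    best, runs = grid_search(xs, ys, grid=[0.0, 1.0, 10.0])
    print(best, len(runs))
```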

Automated Hyperparameter Tuning for Single-node Machine Learning
Optimized and distributed hyperparameter search with enhanced Hyperopt and automated tracking to MLflow.

Automated Model Search for Single-node Machine Learning
Optimized and distributed conditional hyperparameter search with enhanced Hyperopt and automated tracking to MLflow.
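
A conditional search space is one where the set of hyperparameters depends on which model is chosen. The sketch below shows that idea in standard-library Python: it first picks a model family, then samples only the hyperparameters that belong to it. It does not use Hyperopt's own search-space API, and the model names, parameter names, and ranges are hypothetical.

```python
import math
import random

# Hypothetical search space: each model family has its own hyperparameters.
SEARCH_SPACE = {
    "logistic_regression": {"C": ("loguniform", 1e-3, 1e2)},
    "random_forest": {
        "n_estimators": ("randint", 10, 200),
        "max_depth": ("randint", 2, 16),
    },
}


def sample_config(rng):
    """Pick a model family, then sample only that family's hyperparameters."""
    model = rng.choice(sorted(SEARCH_SPACE))
    params = {}
    for name, (kind, low, high) in SEARCH_SPACE[model].items():
        if kind == "loguniform":
            # Uniform in log space, so small and large values are equally likely.
            params[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        elif kind == "randint":
            params[name] = rng.randint(low, high)  # inclusive on both ends
        else:
            raise ValueError(f"unknown distribution: {kind}")
    return {"model": model, "params": params}


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_config(rng))
```

Because each sampled configuration carries both the model name and its own parameters, the same search loop can compare model families and tune them at once.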

Databricks Labs

Databricks Labs AutoML Toolkit
An automated end-to-end model-building pipeline is available through Databricks Labs custom solutions. Contact us for more information.

Featured Partners

Microsoft Azure Machine Learning
Azure Databricks integrates with Microsoft Azure Machine Learning and gives access to the service’s automated machine learning capabilities. Together they provide an end-to-end solution for machine learning on Azure.

DataRobot
DataRobot integration on Databricks brings the power of auto-modeling to Databricks users, allowing them to quickly determine and use the best machine learning model for their problem.

