Databricks Certification and Badging
The new standard for lakehouse training and certifications
Databricks Certified Machine Learning Professional
The Databricks Certified Machine Learning Professional certification exam assesses an individual’s ability to use Databricks Machine Learning and its capabilities to perform advanced machine learning tasks in production. This includes the ability to track, version, and manage machine learning experiments and to manage the machine learning model lifecycle. In addition, the certification exam assesses the ability to implement strategies for deploying machine learning models. Finally, test-takers will also be assessed on their ability to build monitoring solutions to detect data drift. Individuals who pass this certification exam can be expected to perform advanced machine learning engineering tasks using Databricks Machine Learning.
In order to achieve this certification, earners must pass a certification exam. To register for the exam, please either log in or create an account in our certification platform.
This certification is part of the Machine Learning learning pathway. Before attempting this certification, it is recommended that learners obtain the Machine Learning Associate certification.
Key details about the certification exam are provided below.
Minimally Qualified Candidate
The minimally qualified candidate should be able to:
- Track, version, and manage machine learning experiments, including:
  - Data management with Delta Lake and Feature Store (creating and using tables)
  - Experiment tracking with MLflow (logging models and metrics, querying past runs, loading models)
  - Advanced experiment tracking (model signatures, input examples, nested runs, Databricks Autologging, hyperparameter tuning, artifact tracking)
- Manage the machine learning model lifecycle, including:
  - Applying preprocessing logic in production environments (types of flavors, easing downstream use, saving/loading models)
  - Model management with MLflow Model Registry (capabilities, registering models, adding new model versions, transitioning model stages, deleting models and model versions)
  - Automate model management pipelines (implement Model Registry Webhooks, incorporate usage of Databricks Jobs)
- Implement strategies for deploying machine learning models, including:
  - Batch (batch deployment options, scaling single-node models with Spark UDFs, optimizing written prediction tables, scoring using Feature Store tables)
  - Streaming (streaming deployment options, scaling single-node models in streaming pipelines)
  - Real-time (real-time deployment options, RESTful deployment with MLflow Model Serving, querying MLflow Model Serving models)
- Build monitoring solutions for drift detection, including:
  - Types of drift (data drift, concept drift)
  - Drift tests and monitoring (numerical tests, categorical tests, input-label comparison tests)
  - Comprehensive drift solutions (drift monitoring architectures)
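As an illustration of the numerical and categorical drift tests named in the list above, the following is a minimal sketch in plain Python, not material taken from the exam or from Databricks. It compares a reference sample with a current sample using the two-sample Kolmogorov-Smirnov statistic for a numerical feature and the total variation distance for a categorical feature. The function names and the sample data are hypothetical, and the choice of these two measures is an assumption; in practice a library routine such as `scipy.stats.ks_2samp` would typically be used, and it also returns a p-value.

```python
from bisect import bisect_right
from collections import Counter


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic for a numerical feature.

    Returns the largest absolute gap between the two empirical CDFs:
    0.0 means the samples look identical, 1.0 means they do not overlap.
    (Hypothetical helper name, for illustration only.)
    """
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the largest gap.
    for x in set(ref) | set(cur):
        ref_cdf = bisect_right(ref, x) / len(ref)
        cur_cdf = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


def total_variation_distance(reference, current):
    """Total variation distance between two categorical samples.

    Half the sum of absolute differences in category frequencies:
    0.0 means identical proportions, 1.0 means disjoint categories.
    (Hypothetical helper name, for illustration only.)
    """
    ref_counts = Counter(reference)
    cur_counts = Counter(current)
    categories = set(ref_counts) | set(cur_counts)
    return 0.5 * sum(
        abs(ref_counts[c] / len(reference) - cur_counts[c] / len(current))
        for c in categories
    )


if __name__ == "__main__":
    # Made-up samples: training-time values versus recent production values.
    print(ks_statistic([1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]))
    print(total_variation_distance(["a", "a", "b", "b"], ["a", "a", "a", "b"]))
```

A monitoring job could compute such statistics per feature on a schedule and flag a feature when the value crosses a threshold; any threshold would need to be tuned for the data at hand.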
Testers will have 120 minutes to complete the certification exam.
There are 60 multiple-choice questions on the certification exam. The exact distribution of questions across high-level topics will be provided upon release of the certification exam.
Each attempt at the certification exam costs $200. Testers might be subject to taxes depending on their location. Testers can retake the exam as many times as they like, but each attempt costs $200.
There are no test aids available during this exam.
All machine learning code within this exam will be in Python. For workflows or code not specific to machine learning tasks, data manipulation code may be provided in SQL.
Because of the speed at which the responsibilities of a machine learning practitioner and capabilities of the Databricks Lakehouse Platform change, this certification is valid for 2 years following the date on which each tester passes the certification exam.
In order to learn the content assessed by the certification exam, candidates should take one of the following Databricks Academy courses:
- Instructor-led: Machine Learning in Production
- Self-paced (coming soon to Databricks Academy): Machine Learning in Production
Candidates are also able to learn more about the certification exam by taking the certification exam’s overview course (coming soon).