Databricks Certification and Badging

The new standard for lakehouse training and certifications

Databricks Certified Data Engineer Associate

The Databricks Certified Data Engineer Associate certification exam assesses an individual’s ability to use the Databricks Lakehouse Platform to complete introductory data engineering tasks. This includes an understanding of the Lakehouse Platform and its workspace, its architecture, and its capabilities. It also assesses the ability to perform multi-hop architecture ETL tasks using Apache Spark SQL and Python in both batch and incrementally processed paradigms. Finally, the exam assesses the tester’s ability to put basic ETL pipelines and Databricks SQL queries and dashboards into production while maintaining entity permissions. Individuals who pass this certification exam can be expected to complete basic data engineering tasks using Databricks and its associated tools.

Registration

To earn this certification, candidates must pass the certification exam. To register for the exam, please either log in or create an account in our certification platform.

Learning Pathway

This certification is part of the Data Engineer learning pathway.

Exam Details

Key details about the certification exam are provided below.

Minimally Qualified Candidate

The minimally qualified candidate should be able to:

  • Understand how to use the Databricks Lakehouse Platform and its tools, and the benefits of doing so, including:
    • Data Lakehouse (architecture, descriptions, benefits)
    • Data Science and Engineering workspace (clusters, notebooks, data storage)
    • Delta Lake (general concepts, table management and manipulation, optimizations)
  • Build ETL pipelines using Apache Spark SQL and Python, including:
    • Relational entities (databases, tables, views)
    • ELT (creating tables, writing data to tables, cleaning data, combining and reshaping tables, SQL UDFs)
    • Python (facilitating Spark SQL with string manipulation and control flow, passing data between PySpark and Spark SQL)
  • Incrementally process data, including:
    • Structured Streaming (general concepts, triggers, watermarks)
    • Auto Loader (streaming reads)
    • Multi-hop Architecture (bronze-silver-gold, streaming applications)
    • Delta Live Tables (benefits and features)
  • Build production pipelines for data engineering applications and Databricks SQL queries and dashboards, including:
    • Jobs (scheduling, task orchestration, UI)
    • Dashboards (endpoints, scheduling, alerting, refreshing)
  • Understand and follow best security practices, including:
    • Unity Catalog (benefits and features)
    • Entity Permissions (team-based permissions, user-based permissions)

Duration

Testers will have 90 minutes to complete the certification exam.

Questions

There are 45 multiple-choice questions on the certification exam. The questions will be distributed by high-level topic in the following way:

  • Databricks Lakehouse Platform – 24% (11/45)
  • ELT with Spark SQL and Python – 29% (13/45)
  • Incremental Data Processing – 22% (10/45)
  • Production Pipelines – 16% (7/45)
  • Data Governance – 9% (4/45)
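As a quick arithmetic check of the distribution above, the following minimal Python sketch recomputes each topic's share from its question count and derives the average time available per question from the 90-minute exam duration. The variable names are illustrative only and are not part of any Databricks tooling.

```python
# Question counts per topic, copied from the list above.
question_counts = {
    "Databricks Lakehouse Platform": 11,
    "ELT with Spark SQL and Python": 13,
    "Incremental Data Processing": 10,
    "Production Pipelines": 7,
    "Data Governance": 4,
}

# Exam duration in minutes, as stated in the Duration section.
exam_minutes = 90

total_questions = sum(question_counts.values())

# Share of each topic, rounded to the nearest whole percent.
percentages = {
    topic: round(100 * count / total_questions)
    for topic, count in question_counts.items()
}

# Average time budget per question.
minutes_per_question = exam_minutes / total_questions

for topic, pct in percentages.items():
    print(f"{topic}: {pct}% ({question_counts[topic]}/{total_questions})")
print(f"Average time per question: {minutes_per_question} minutes")
```

The counts sum to the stated 45 questions, and the rounded shares match the published percentages, which leaves an average of two minutes per question.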

Cost

Each attempt at the certification exam costs $200. Depending on their location, testers may also be subject to applicable taxes. Testers may retake the exam as many times as they like, but they must pay $200 for each attempt.

Test Aids

There are no test aids available during this exam.

Programming Language

The certification exam will provide data manipulation code in SQL when possible. In all other cases, code will be in Python.

Expiration

Because of the speed at which the responsibilities of a data engineer and capabilities of the Databricks Lakehouse Platform change, this certification is valid for 2 years following the date on which each tester passes the certification exam.

Preparation

In order to learn the content assessed by the certification exam, candidates should take one of the following Databricks Academy courses:

Candidates are also able to learn more about the certification exam by taking the certification exam’s overview course (coming soon).

Before taking the exam, it is recommended that candidates complete the practice exam.

Frequently Asked Questions

To view answers to frequently asked questions (FAQs), please refer to the Databricks Academy FAQ document.