Databricks Certification and Badging

The new standard for lakehouse training and certifications

Databricks Certified Associate Developer for Apache Spark

The Databricks Certified Associate Developer for Apache Spark certification exam assesses a candidate's understanding of the Spark DataFrame API and their ability to apply it to complete basic data manipulation tasks within a Spark session. These tasks include selecting, renaming and manipulating columns; filtering, dropping, sorting, and aggregating rows; handling missing data; combining, reading, writing and partitioning DataFrames with schemas; and working with UDFs and Spark SQL functions. In addition, the exam assesses the basics of the Spark architecture, such as execution/deployment modes, the execution hierarchy, fault tolerance, garbage collection, and broadcasting. Individuals who pass this certification exam can be expected to complete basic Spark DataFrame tasks using Python or Scala.

Registration

To achieve this certification, earners must pass a certification exam. To register for the exam, please either log in or create an account in our certification platform.

Learning Pathway

This certification is part of the Apache Spark learning pathway.

Exam Details

Key details about the certification exam are provided below.

Minimally Qualified Candidate

The minimally qualified candidate should be able to:

  • Understand the basics of the Spark architecture, including Adaptive Query Execution
  • Apply the Spark DataFrame API to complete individual data manipulation tasks, including:
    • selecting, renaming and manipulating columns
    • filtering, dropping, sorting, and aggregating rows
    • joining, reading, writing and partitioning DataFrames
    • working with UDFs and Spark SQL functions

While it will not be explicitly tested, the candidate must have a working knowledge of either Python or Scala. The exam is available in both languages.

Duration

Testers will have 120 minutes to complete the certification exam.

Questions

There are 60 multiple-choice questions on the certification exam. The questions will be distributed by high-level topic in the following way:

  • Apache Spark Architecture Concepts – 17% (10/60)
  • Apache Spark Architecture Applications – 11% (7/60)
  • Apache Spark DataFrame API Applications – 72% (43/60)
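
As a quick check of the distribution above, the short Python sketch below recomputes each topic's share from the question counts. The variable names are illustrative. The computed shares come out slightly different from the listed whole-number percentages, which appear to be rounded so that they sum to 100.

```python
# Question counts per topic, as listed above.
counts = {
    "Apache Spark Architecture Concepts": 10,
    "Apache Spark Architecture Applications": 7,
    "Apache Spark DataFrame API Applications": 43,
}

total = sum(counts.values())

# Share of the exam per topic, as a percentage rounded to one decimal place.
shares = {topic: round(100 * n / total, 1) for topic, n in counts.items()}

for topic, share in shares.items():
    print(f"{topic}: {counts[topic]}/{total} = {share}%")
```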

Cost

Each attempt at the certification exam costs $200. Testers may also be subject to taxes depending on their location. Testers can retake the exam as many times as they like, but they must pay $200 for each attempt.

Test Aids

The following test aids will be available to be used by candidates during the exam:

  • Apache Spark API documentation for the language in which they’re taking the exam. An example of these test aids is available here: Python/Scala.
  • A digital notepad to use during the active exam time – candidates will not be able to bring notes to the exam or take notes away from the exam

Programming Language

This certification exam is available in Python and Scala.

Expiration

Because of the speed at which the responsibilities of a data engineer and capabilities of the Databricks Lakehouse Platform change, this certification is valid for 2 years following the date on which each tester passes the certification exam.

Preparation

In order to learn the content assessed by the certification exam, candidates should take one of the following Databricks Academy courses:

In addition, candidates can learn more about the certification exam by taking the Certification Overview: Databricks Certified Associate Developer for Apache Spark Exam course.

Before taking the exam, it is recommended that you complete the practice exam for your language of choice: Python or Scala.

Frequently Asked Questions

To view answers to frequently asked questions (FAQs), please refer to the Databricks Academy FAQ document.