Databricks Certification and Badging
The new standard for lakehouse training and certifications
Databricks Certified Associate Developer for Apache Spark
The Databricks Certified Associate Developer for Apache Spark certification exam assesses the understanding of the Spark DataFrame API and the ability to apply the Spark DataFrame API to complete basic data manipulation tasks within a Spark session. These tasks include selecting, renaming and manipulating columns; filtering, dropping, sorting, and aggregating rows; handling missing data; combining, reading, writing and partitioning DataFrames with schemas; and working with UDFs and Spark SQL functions. In addition, the exam will assess the basics of the Spark architecture like execution/deployment modes, the execution hierarchy, fault tolerance, garbage collection, and broadcasting. Individuals who pass this certification exam can be expected to complete basic Spark DataFrame tasks using Python or Scala.
To earn this certification, candidates must pass the certification exam. To register for the exam, please either log in or create an account in our certification platform.
This certification is part of the Apache Spark learning pathway.
Key details about the certification exam are provided below.
Minimally Qualified Candidate
The minimally qualified candidate should be able to:
- Understand the basics of the Spark architecture, including Adaptive Query Execution
- Apply the Spark DataFrame API to complete individual data manipulation tasks, including:
  - selecting, renaming and manipulating columns
  - filtering, dropping, sorting, and aggregating rows
  - joining, reading, writing and partitioning DataFrames
  - working with UDFs and Spark SQL functions
While it will not be explicitly tested, the candidate must have a working knowledge of either Python or Scala. The exam is available in both languages.
Testers will have 120 minutes to complete the certification exam.
There are 60 multiple-choice questions on the certification exam. The questions will be distributed by high-level topic in the following way:
- Apache Spark Architecture Concepts – 17% (10/60)
- Apache Spark Architecture Applications – 11% (7/60)
- Apache Spark DataFrame API Applications – 72% (43/60)
Each attempt at the certification exam costs $200, plus any taxes that apply in the tester's location. Testers may retake the exam as many times as they like, but they must pay $200 for each attempt.
The following test aids will be available to be used by candidates during the exam:
- Apache Spark API documentation for the language in which they’re taking the exam. An example of these test aids is available here: Python/Scala.
- A digital notepad to use during the active exam time – candidates will not be able to bring notes to the exam or take notes away from the exam
This certification exam is available in Python and Scala.
Because of the speed at which the responsibilities of a data engineer and capabilities of the Databricks Lakehouse Platform change, this certification is valid for 2 years following the date on which each tester passes the certification exam.
In order to learn the content assessed by the certification exam, candidates should take one of the following Databricks Academy courses:
- Instructor-led: Apache Spark Programming with Databricks
- Self-paced: Apache Spark Programming with Databricks (available in Databricks Academy)
In addition, candidates can learn more about the certification exam by taking the Certification Overview: Databricks Certified Associate Developer for Apache Spark Exam course.