SESSION

Databricks DevOps for Pipelines

OVERVIEW

EXPERIENCE: In Person
TYPE: Paid Training
TRACK: Paid Training
DURATION: 240 min
  • Audience: Data engineers
  • Hands-on labs: Yes
  • Certification path: Databricks Certified Data Engineer Professional
  • Description: In this half-day course, you’ll learn how to apply software engineering best practices to develop, test, and deploy Delta Live Tables (DLT) pipeline code to production. Topics include modularizing code with libraries, parameterization, metaprogramming, and portable expectations; configuring unit testing and deployment environments for DLT pipelines; and monitoring data pipelines for continued operation and data quality by querying and visualizing metrics from event logs.
  • Prerequisites: Ability to perform basic code development tasks in the Databricks workspace (create clusters, run code in notebooks, use basic notebook operations, import repos from Git, etc.); intermediate programming experience with PySpark (extract data from a variety of file formats and data sources, apply common transformations to clean data, reshape and manipulate complex data using advanced built-in functions); intermediate experience with Delta Lake (create tables, perform complete and incremental updates, compact files, restore previous versions, etc.); and beginner experience configuring and scheduling data pipelines using the Delta Live Tables (DLT) UI.
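
As a rough illustration of the "portable expectations" topic named in the description, the sketch below keeps data quality rules as plain data, separate from pipeline code, so the same rules can be reused across pipelines and checked in unit tests. This is not course material: the rule names, constraints, tags, and the helpers `get_rules` and `quarantine_predicate` are all hypothetical, and the DLT usage appears only in comments because the `dlt` module is available only inside a running pipeline.

```python
# Hypothetical rule set: names, constraints, and tags are illustrative only.
RULES = [
    {"name": "valid_id", "constraint": "customer_id IS NOT NULL", "tag": "validity"},
    {"name": "positive_amount", "constraint": "amount > 0", "tag": "validity"},
    {"name": "recent_order", "constraint": "order_date >= '2020-01-01'", "tag": "freshness"},
]


def get_rules(tag, rules=RULES):
    """Return {name: constraint} for every rule carrying the given tag."""
    return {r["name"]: r["constraint"] for r in rules if r["tag"] == tag}


def quarantine_predicate(expectations):
    """Build a SQL predicate that negates the conjunction of all rules.

    NULL handling is deliberately left out of this sketch.
    """
    if not expectations:
        return "FALSE"
    passed = " AND ".join(f"({c})" for c in expectations.values())
    return f"NOT ({passed})"


validity = get_rules("validity")
print(validity)
print(quarantine_predicate(validity))

# Inside a DLT pipeline (not runnable here), the same dict could be passed to
# an expectation decorator. Table names below are assumptions:
#
#   import dlt
#
#   @dlt.table
#   @dlt.expect_all_or_drop(get_rules("validity"))
#   def orders_clean():
#       return dlt.read("orders_raw")
```

Because the rules are ordinary Python data, they can be filtered by tag, versioned alongside the pipeline code, and asserted against in a local test without starting a pipeline.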