From Imperative to Declarative Paradigm: Rebuilding a CI/CD Infrastructure Using Hatch and DABs
Overview
| Experience | In Person |
|---|---|
| Type | Breakout |
| Track | Data Engineering and Streaming |
| Industry | Media and Entertainment |
| Technologies | Apache Spark, Databricks Workflows |
| Skill Level | Intermediate |
| Duration | 40 min |
Building and deploying PySpark pipelines to Databricks should be effortless.
However, our team at FreeWheel struggled for a long time with a convoluted, hard-to-maintain CI/CD infrastructure. It followed an imperative paradigm: every project had to implement custom scripts to build artifacts and deploy resources, resulting in redundant boilerplate code and awkward interactions with the Databricks REST API.
We set our minds on rebuilding it from scratch, following a declarative paradigm instead. We will share how we eliminated thousands of lines of code from our repository, created a fully configuration-driven infrastructure where projects can be onboarded easily, and improved the quality of our codebase using Hatch and Databricks Asset Bundles as our tools of choice. In particular, DABs have made deploying across our three environments a breeze, and have allowed us to adopt new features as soon as Databricks releases them. A sketch of what such a declarative configuration can look like follows below.
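To illustrate the declarative approach the session describes, here is a minimal sketch of a Databricks Asset Bundle configuration (`databricks.yml`) with one target per environment. The bundle name, workspace hosts, file path, and cluster settings are hypothetical placeholders, not the speakers' actual setup.

```yaml
# databricks.yml -- hypothetical bundle definition for a PySpark pipeline
bundle:
  name: example_pipeline  # placeholder project name

# One target per environment; deployment differences live in config, not scripts.
targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://dev-workspace.cloud.databricks.com   # placeholder host
  staging:
    workspace:
      host: https://staging-workspace.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://prod-workspace.cloud.databricks.com

# Resources (jobs, pipelines, etc.) are declared once and deployed to any target.
resources:
  jobs:
    example_pipeline_job:
      name: example_pipeline_job
      tasks:
        - task_key: main
          spark_python_task:
            python_file: ./src/main.py   # placeholder entry point
          new_cluster:
            spark_version: "15.4.x-scala2.12"
            node_type_id: "i3.xlarge"
            num_workers: 2
```

With a layout like this, promoting a pipeline between environments is typically a single CLI call (for example, `databricks bundle deploy -t prod`) instead of a custom script driving the Databricks REST API.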
Session Speakers
Luigi Di Tacchio
Software Engineer
FreeWheel
Saswati Bhoi
Sr. SRE
Comcast