Session

From Imperative to Declarative Paradigm: Rebuilding a CI/CD Infrastructure Using Hatch and DABs

Overview

Experience: In Person
Type: Breakout
Track: Data Engineering and Streaming
Industry: Media and Entertainment
Technologies: Apache Spark, Databricks Workflows
Skill Level: Intermediate
Duration: 40 min

Building and deploying PySpark pipelines to Databricks should be effortless.

However, our team at FreeWheel long struggled with a convoluted, hard-to-maintain CI/CD infrastructure. It followed an imperative paradigm: every project had to implement custom scripts to build artifacts and deploy resources, which led to redundant boilerplate code and awkward interactions with the Databricks REST API.

We set out to rebuild it from scratch, following a declarative paradigm instead. We will share how we eliminated thousands of lines of code from our repository, created a fully configuration-driven infrastructure where projects can be onboarded easily, and improved the quality of our codebase using Hatch and Databricks Asset Bundles as our tools of choice. In particular, DABs have made deploying across our three environments a breeze and have allowed us to adopt new features as soon as they are released by Databricks.
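To give a sense of the declarative approach, the sketch below shows what a minimal Databricks Asset Bundle configuration (databricks.yml) with one job and three deployment targets can look like. The bundle name, job, file path, and workspace URLs are placeholders for illustration, not FreeWheel's actual setup.

bundle:
  name: example_pipeline            # illustrative bundle name

resources:
  jobs:
    example_job:                    # illustrative job; a real pipeline defines its own tasks
      name: example_job
      tasks:
        - task_key: main
          spark_python_task:
            python_file: ./src/main.py

targets:
  dev:
    default: true
    workspace:
      host: https://dev.cloud.databricks.com        # placeholder workspace URL
  staging:
    workspace:
      host: https://staging.cloud.databricks.com    # placeholder workspace URL
  prod:
    mode: production
    workspace:
      host: https://prod.cloud.databricks.com       # placeholder workspace URL

With a configuration like this, a single command such as "databricks bundle deploy -t staging" deploys the same project definition to any of the targets, with no custom deployment scripts.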

Session Speakers


Luigi Di Tacchio

Software Engineer
FreeWheel


Saswati Bhoi

Sr. SRE
Comcast