From Spaghetti Bowl Pipeline to DLT Efficiency
Overview
| Experience | In Person |
|---|---|
| Type | Lightning Talk |
| Track | Data Engineering and Streaming |
| Industry | Health and Life Sciences |
| Technologies | Databricks Workflows, DLT, Unity Catalog |
| Skill Level | Beginner |
| Duration | 20 min |
In today's data-driven world, the ability to efficiently manage and transform data is crucial for any organization. This presentation explores the process of converting a complex, hard-to-maintain Alteryx workflow into a clean, simple DLT pipeline at Intermountain Health, a large integrated health system.
Alteryx is a powerful tool for data preparation and blending, but as workflows grow in complexity, they become difficult to manage and maintain. DLT, by contrast, offers a more democratized, streamlined and scalable approach to data engineering, leveraging the power of Apache Spark and Delta Lake.
We will begin by examining a typical legacy workflow, identifying common pain points such as tangled logic, performance bottlenecks and maintenance challenges. Next, we will demonstrate how to translate this workflow into a DLT pipeline, highlighting key steps such as data transformation, validation and delivery.
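The transformation, validation and delivery steps above can be sketched as a declarative DLT pipeline. This is a minimal illustration, not the presenters' actual code: the table names, source path, column names and the expectation rule are all hypothetical, and the snippet runs only inside a Databricks DLT pipeline (where `spark` is provided), not as a standalone script.

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw records ingested from a landing zone (path is hypothetical)")
def encounters_raw():
    # Ingestion step: read raw CSV files as-is
    return spark.read.option("header", "true").csv("/Volumes/main/raw/encounters/")

@dlt.table(comment="Validated, typed records")
@dlt.expect_or_drop("valid_encounter_id", "encounter_id IS NOT NULL")
def encounters_clean():
    # Transformation + validation: cast types, standardize names;
    # rows failing the expectation above are dropped and counted in pipeline metrics
    return (
        dlt.read("encounters_raw")
        .withColumn("encounter_date", F.to_date("encounter_date"))
        .withColumnRenamed("pat_id", "patient_id")
    )

@dlt.table(comment="Aggregated table delivered to analysts")
def encounters_daily():
    # Delivery step: a small gold table for downstream consumption
    return dlt.read("encounters_clean").groupBy("encounter_date").count()
```

Each `@dlt.table` function declares one dataset; DLT infers the dependency graph from the `dlt.read` calls and handles orchestration, which is what replaces the tangled hand-wired logic of the legacy workflow.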
Session Speakers
Peter Jones
Analytics Engineer
Intermountain Healthcare