Unlocking the Lakehouse with Efficient Data Pipelines
OVERVIEW
| EXPERIENCE | In Person |
| --- | --- |
| TYPE | Breakout |
| TRACK | Data Lakehouse Architecture |
| INDUSTRY | Enterprise Technology |
| TECHNOLOGIES | Apache Spark, Delta Lake |
| SKILL LEVEL | Intermediate |
| DURATION | 40 min |
Capital One is a pioneer in data-driven digital transformation and one of the earliest cloud-first enterprises. Data is critical to every decision we make, big or small: from launching innovative products and building exceptional banking experiences to automating vulnerability remediation and cloud cost optimization. Capital One's centralized data lake lies at the core of this data-driven ecosystem. We built it on the open lakehouse model, which lets us move rapidly and fuels data-driven decision-making to change banking for good. Join us in this breakout session to discuss how Capital One enabled its centralized data lake by building an efficient, resilient, near real-time data ingestion pipeline with Delta Lake and Databricks. As we walk through the journey, we'll share key learnings and best practices for getting the most out of the Databricks platform.
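The session page doesn't detail the pipeline itself, but as a rough illustration of the pattern described above, a minimal near real-time ingestion job on Databricks might look like the sketch below. It uses Spark Structured Streaming with Databricks Auto Loader ("cloudFiles") writing into a Delta table; this is one common way to implement incremental, resilient file ingestion, not necessarily the approach Capital One used. All paths, schema locations, and table names are hypothetical.

```python
# Illustrative sketch of a near real-time ingestion pipeline into a
# Delta Lake table using Spark Structured Streaming on Databricks.
# Every path and table name below is a placeholder assumption.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lakehouse-ingestion").getOrCreate()

# Incrementally discover newly arriving files with Auto Loader ("cloudFiles").
# The source directory and schema-tracking location are assumptions.
raw_stream = (
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/lake/_schemas/events")
    .load("/mnt/lake/raw/events")
)

# Continuously append into a Delta table. The checkpoint location gives
# exactly-once processing across restarts, which is what makes this kind
# of pipeline resilient to failures.
query = (
    raw_stream.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/events")
    .trigger(processingTime="1 minute")
    .toTable("lake.bronze_events")
)
```

The trigger interval is a tunable trade-off: shorter intervals reduce end-to-end latency, while longer ones produce fewer, larger Delta file commits.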
SESSION SPEAKERS
Usman Zubair, Lead Technologist, Databricks
Prabodh Mhalgi, Sr. Lead Data Engineer, Capital One