SESSION

Unlocking the Lakehouse with Efficient Data Pipelines


OVERVIEW

EXPERIENCE: In Person
TYPE: Breakout
TRACK: Data Lakehouse Architecture
INDUSTRY: Enterprise Technology
TECHNOLOGIES: Apache Spark, Delta Lake
SKILL LEVEL: Intermediate
DURATION: 40 min

Capital One is a pioneer in data-driven digital transformation and one of the earliest cloud-first enterprises. Data is critical to the decisions we make, big or small: from launching innovative products and building exceptional banking experiences to running automated vulnerability remediation and optimizing cloud costs. Capital One's centralized data lake lies at the core of this data-driven ecosystem. We adopted the open Lakehouse ecosystem model to build our data lake, which lets us move rapidly and fuels the data-driven decision-making we rely on to change banking for good. Join us in this breakout session to discuss how Capital One enabled its centralized data lake by building an efficient, resilient, near-real-time data ingestion pipeline using Delta Lake and Databricks. As we walk through the journey, we'll share key learnings and best practices for making the most of the Databricks platform.

SESSION SPEAKERS

Usman Zubair

Lead Technologist
Databricks

Prabodh Mhalgi

Sr. Lead Data Engineer
Capital One