Unlocking the Lakehouse with Efficient Data Pipelines
OVERVIEW
| EXPERIENCE | In Person |
| --- | --- |
| TYPE | Breakout |
| TRACK | Data Lakehouse Architecture |
| INDUSTRY | Enterprise Technology |
| TECHNOLOGIES | Apache Spark, Delta Lake |
| SKILL LEVEL | Intermediate |
| DURATION | 40 min |
Capital One is a pioneer in data-driven digital transformation and one of the earliest cloud-first enterprises. Data is critical to every decision we make, big or small: from launching innovative products and building exceptional banking experiences to automating vulnerability remediation and cloud cost optimization. Capital One's centralized data lake lies at the core of this data-driven ecosystem. We built it on the open lakehouse model, which lets us move rapidly and fuels data-driven decision-making to change banking for good. Join us in this breakout session to discuss how Capital One enabled its centralized data lake by building an efficient, resilient, near real-time data ingestion pipeline with Delta Lake and Databricks. As we walk through the journey, we'll share key learnings and best practices for getting the most out of the Databricks platform.
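The session page doesn't detail the pipeline itself, but as a rough illustration of the pattern described above, a minimal near real-time ingestion job on Databricks might look like the sketch below. It uses Spark Structured Streaming with Databricks Auto Loader ("cloudFiles") writing into a Delta table; this is one common way to implement incremental, resilient file ingestion, not necessarily the approach Capital One used. All paths, schema locations, and table names are hypothetical.

```python
# Illustrative sketch of a near real-time ingestion pipeline into a
# Delta Lake table using Spark Structured Streaming on Databricks.
# Every path and table name below is a placeholder assumption.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lakehouse-ingestion").getOrCreate()

# Incrementally discover newly arriving files with Auto Loader ("cloudFiles").
# The source directory and schema-tracking location are assumptions.
raw_stream = (
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/lake/_schemas/events")
    .load("/mnt/lake/raw/events")
)

# Continuously append into a Delta table. The checkpoint location gives
# exactly-once processing across restarts, which is what makes this kind
# of pipeline resilient to failures.
query = (
    raw_stream.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/events")
    .trigger(processingTime="1 minute")
    .toTable("lake.bronze_events")
)
```

The trigger interval is a tunable trade-off: shorter intervals reduce end-to-end latency, while longer ones produce fewer, larger Delta file commits.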
SESSION SPEAKERS
Usman Zubair, Lead Technologist, Databricks
Prabodh Mhalgi, Sr. Lead Data Engineer, Capital One