SESSION

Towards Multi-Table Transactions in Delta Lake

OVERVIEW

EXPERIENCE: In Person
TYPE: Lightning Talk
TRACK: Data Lakehouse Architecture
INDUSTRY: Enterprise Technology, Health and Life Sciences, Travel and Hospitality, Financial Services
TECHNOLOGIES: Delta Lake, Governance
SKILL LEVEL: Intermediate
DURATION: 20 min

This talk discusses Coordinated Commits, a new commit protocol for Delta Lake that moves the source of commit atomicity from the object store to an external commit coordinator (e.g., HMS, Unity Catalog, or Glue). This gives more flexibility in how transactions are performed and lays the foundation for advanced features such as multi-statement transactions. Delta was originally built on the premise that cloud storage is the source of truth. However, cloud storage offers limited primitives for atomicity; more specifically, object stores lack the means to perform atomic commits for more than a single write/statement. The new protocol has the following goals:

 

  • Support multi-table-multi-statement transactions.
  • Provide reliable commit semantics even when the underlying object store lacks put-if-absent semantics (e.g., S3).
  • Support data governance overwrite operations.
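
To make the idea of coordinator-backed atomicity concrete, here is a minimal sketch in Python. It is not Delta Lake's API or the actual Coordinated Commits protocol: `ToyCommitCoordinator`, `CommitConflict`, and the table names are hypothetical, and an in-memory lock stands in for whatever atomicity a real coordinator provides. It only illustrates how a single external arbiter can accept or reject a commit that spans several tables as one unit, without relying on put-if-absent in the object store.

```python
import threading


class CommitConflict(Exception):
    """Raised when a proposed version is not the next one for a table."""


class ToyCommitCoordinator:
    """Hypothetical in-memory stand-in for an external commit coordinator."""

    def __init__(self):
        self._lock = threading.Lock()
        self._log = {}  # table name -> list of payloads; list index == version

    def latest_version(self, table):
        with self._lock:
            return len(self._log.get(table, [])) - 1  # -1 means no commits yet

    def commit(self, proposals):
        """Atomically commit {table: (version, payload)}, or commit nothing."""
        with self._lock:
            # Validate every table first so a conflict leaves no partial state.
            for table, (version, _) in proposals.items():
                expected = len(self._log.get(table, []))
                if version != expected:
                    raise CommitConflict(
                        f"{table}: expected version {expected}, got {version}"
                    )
            for table, (_, payload) in proposals.items():
                self._log.setdefault(table, []).append(payload)


coordinator = ToyCommitCoordinator()
coordinator.commit({"orders": (0, "create orders")})
coordinator.commit({"orders": (1, "add rows"), "payments": (0, "create payments")})

try:
    # Stale writer: "orders" version 1 is already taken, so nothing is applied,
    # including the otherwise valid proposal for "payments".
    coordinator.commit({"orders": (1, "late write"), "payments": (1, "late write")})
    conflict = False
except CommitConflict:
    conflict = True
```

In this sketch the coordinator, not the storage layer, decides which writer wins a version, which is why the same check can cover more than one table at a time.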

SESSION SPEAKERS

Prakhar Jain

Staff Software Engineer
Databricks