Power Up Your Lakehouse with Git Semantics & Delta Lake
OVERVIEW
EXPERIENCE | In Person |
---|---|
TYPE | Breakout |
TRACK | Data Lakehouse Architecture |
INDUSTRY | Enterprise Technology |
TECHNOLOGIES | Delta Lake, Developer Experience |
SKILL LEVEL | Intermediate |
DURATION | 40 min |
The lakehouse architecture has become the backbone of big data operations, but it doesn’t come without challenges. Data versioning (also known as time travel) is one that surfaces across many areas of DataOps: using a write-audit-publish flow to test and verify changes before release, rolling back changes to a consistent, known-good state, creating reproducible workloads that span multiple tables (and code!), and building economical, ad hoc dev/test environments with zero data copies. Luckily, data engineering has made considerable progress, and there are great OSS tools that can help overcome these challenges. In this talk, we’ll present how Delta Lake and lakeFS together can apply Git-like semantics to improve time travel for lakehouses. Delta Lake delivers a linear history through table snapshots, while lakeFS adds a layer of branching and merging capabilities, resulting in improved data quality and economics for your operations.
SESSION SPEAKERS
Oz Katz
CTO & Co-creator of lakeFS
lakeFS