Data Reproducibility, Audits, Immediate Rollbacks, and other Applications of Time Travel with Delta Lake - Databricks

Time travel is now possible with Delta Lake! We will uncover how Delta Lake makes time travel possible and why it matters to you. Through presentation, notebooks, and code, we will showcase several common applications and how they can improve your modern data engineering pipelines. Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark(TM). It provides snapshot isolation for concurrent reads and writes, and it enables efficient upserts, deletes, and immediate rollbacks. It also allows background file optimization through compaction and Z-Order partitioning, achieving up to 100x performance improvements. In this presentation you will learn what challenges Delta Lake solves, how Delta Lake works under the hood, and applications of the new Delta Time Travel capability.
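
To give a feel for why a transaction log makes time travel and rollback cheap, here is a minimal sketch in plain Python. It is a toy model, not Delta Lake's implementation: the names `Commit`, `ToyVersionedTable`, `snapshot`, and `rollback` are hypothetical, and the file names are made up. The only idea it illustrates is that if every change is an appended commit listing files added and removed, then any past version can be rebuilt by replaying the log up to that commit.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """One entry in the toy log: files added and files removed."""
    added: frozenset
    removed: frozenset


@dataclass
class ToyVersionedTable:
    """Hypothetical toy table whose state is defined by an append-only log."""
    log: list = field(default_factory=list)

    def commit(self, added=(), removed=()):
        """Append a commit and return its 0-based version number."""
        self.log.append(Commit(frozenset(added), frozenset(removed)))
        return len(self.log) - 1

    def snapshot(self, version=None):
        """Replay commits 0..version (or all commits) to get the live files."""
        if version is None:
            entries = self.log
        elif 0 <= version < len(self.log):
            entries = self.log[: version + 1]
        else:
            raise ValueError(f"unknown version {version}")
        live = set()
        for entry in entries:
            live -= entry.removed
            live |= entry.added
        return frozenset(live)

    def rollback(self, version):
        """Restore an old snapshot by appending a new commit; history is kept."""
        target = self.snapshot(version)
        current = self.snapshot()
        return self.commit(added=target - current, removed=current - target)


# Made-up file names, for illustration only.
table = ToyVersionedTable()
v0 = table.commit(added={"part-0.parquet"})
v1 = table.commit(added={"part-1.parquet"})
v2 = table.commit(added={"part-2.parquet"}, removed={"part-0.parquet"})
v3 = table.rollback(v1)  # the state of v1 becomes the newest version
```

In this sketch a rollback never rewrites history. It appends one more commit, so the bad version stays readable for audits. Delta Lake itself exposes reads of older versions through Spark reader options such as `versionAsOf` and `timestampAsOf`; the toy class above only mirrors the concept.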



About Kyle Weller

Kyle Weller is a Program Manager on Microsoft's Azure Databricks team. After working as a Software Developer at a few startups, he joined Microsoft and built scalable data engineering platforms and API solutions for massive services in Office and Bing. He was on the Cortana Data Science team, driving the product measurement and growth strategy, and then worked on the Azure Machine Learning team, enabling R and Python execution in SQL Server. He currently partners with Databricks to make Azure the best place to run Apache Spark workloads. Connect with him on LinkedIn here: https://www.linkedin.com/in/AzureDatabricks/