Project Lightspeed: Faster and Simpler Stream Processing With Apache Spark
June 28, 2022 by Karthik Ramasamy, Matei Zaharia, Reynold Xin, Michael Armbrust, Awez Syed, Ray Zhu, Alexander Balikov, Jerry Peng, Shrikanth Shankar and Sameer Paranjpye in Engineering Blog
Streaming data is a critical area of computing today. It is the basis for making quick decisions on the enormous amounts of incoming...
Announcing General Availability of Databricks’ Delta Live Tables (DLT)
April 5, 2022 by Michael Armbrust, Awez Syed, Paul Lappas, Erika Ehrli, Sam Steiny, Richard Tomlinson, Andreas Neumann and Mukul Murthy in Platform Blog
Today, we are thrilled to announce that Delta Live Tables (DLT) is generally available (GA) on the Amazon AWS and Microsoft Azure clouds...
Databricks Delta Live Tables Announces Support for Simplified Change Data Capture
February 10, 2022 by Michael Armbrust, Paul Lappas and Amit Kara in Platform Blog
As organizations adopt the data lakehouse architecture, data engineers are looking for efficient ways to capture continually arriving data. Even with the right...
Frequently Asked Questions About the Data Lakehouse
August 30, 2021 by Michael Armbrust, Bharath Gowda, Reynold Xin, Matei Zaharia and Ali Ghodsi in Platform Blog
Question Index: What is a Data Lakehouse? What is a Data Lake? What is a Data Warehouse? How is a Data Lakehouse different...
Announcing the Launch of Delta Live Tables: Reliable Data Engineering Made Easy
May 27, 2021 by Michael Armbrust, Awez Syed and Sam Steiny in Platform Blog
As the amount of data, data sources and data types at organizations grow, building and maintaining reliable data...
Introducing Delta Sharing: An Open Protocol for Secure Data Sharing
May 26, 2021 by Matei Zaharia, Michael Armbrust, Steve Weis, Todd Greenstein and Cyrielle Simeone in Platform Blog
Update: Delta Sharing is now generally available on AWS and Azure. Get an early preview of O'Reilly's new ebook for the step-by-step guidance...
What Is a Lakehouse?
January 30, 2020 by Ben Lorica, Michael Armbrust, Reynold Xin, Matei Zaharia and Ali Ghodsi in Engineering Blog
Read Building the Data Lakehouse to explore why lakehouses are the data architecture of the future with the father of the data warehouse...
Delta Lake Now Hosted by the Linux Foundation to Become the Open Standard for Data Lakes
October 16, 2019 by Michael Armbrust and Reynold Xin in Platform Blog
Get an early preview of O'Reilly's new ebook for the step-by-step guidance you need to start using Delta Lake. At today’s Spark +...
Diving Into Delta Lake: Unpacking The Transaction Log
August 21, 2019 by Burak Yavuz, Michael Armbrust and Brenner Heintz in Company Blog
The transaction log is key to understanding Delta Lake because it is the common thread that runs through many of its most important...
How to Work with Avro, Kafka, and Schema Registry in Databricks
February 15, 2019 by Wenchen Fan and Michael Armbrust in Solutions
In the previous blog post, we introduced the new built-in Apache Avro data source in Apache Spark and explained how you can...