Frequently Asked Questions About the Data Lakehouse
August 30, 2021 by Michael Armbrust, Bharath Gowda, Reynold Xin, Matei Zaharia and Ali Ghodsi in Platform Blog
Question Index: What is a Data Lakehouse? What is a Data Lake? What is a Data Warehouse? How is a Data Lakehouse different...

Monitoring ML Models With Model Assertions
July 22, 2021 by Daniel Kang, Deepti Raghavan, Peter Bailis and Matei Zaharia in Engineering Blog
This is a guest post from the Stanford University Computer Science Department. We thank Daniel Kang, Deepti Raghavan and Peter Bailis of Stanford...

Introducing Delta Sharing: An Open Protocol for Secure Data Sharing
May 26, 2021 by Matei Zaharia, Michael Armbrust, Steve Weis, Todd Greenstein and Cyrielle Simeone in Platform Blog
Update: Delta Sharing is now generally available on AWS and Azure. Get an early preview of O'Reilly's new ebook for the step-by-step guidance...

An Update on Project Zen: Improving Apache Spark for Python Users
September 4, 2020 by Hyukjin Kwon and Matei Zaharia in Solutions
Apache Spark™ has reached its 10th anniversary with Apache Spark 3.0 which has many significant improvements and new features including but not limited...

Spark + AI Summit Europe is Expanding and Getting a New Name: Data + AI Summit Europe
September 2, 2020 by Ali Ghodsi, Reynold Xin and Matei Zaharia in Company Blog
Back in 2013, we held the first Spark Summit, a gathering of the Apache Spark™ community with leading contributors and production users...

Introducing the Next-Generation Data Science Workspace
June 25, 2020 by Clemens Mewald, Matei Zaharia and Cyrielle Simeone in Product
At today’s Spark + AI Summit 2020, we unveiled the next generation of the Databricks Data Science Workspace: An open and unified experience...

MLflow Joins the Linux Foundation to Become the Open Standard for Machine Learning Platforms
June 25, 2020 by Clemens Mewald, Matei Zaharia and Cyrielle Simeone in Product
At today's Spark + AI Summit 2020, we announced that MLflow is becoming a Linux Foundation...

Introducing Apache Spark 3.0
June 18, 2020 by Matei Zaharia, Reynold Xin, Xiao Li, Wenchen Fan and Yin Huai in Product
We’re excited to announce that the Apache Spark™ 3.0.0 release is available on Databricks as part of our new Databricks Runtime 7.0...

Evolving the Databricks brand
April 29, 2020 by Matei Zaharia, Reynold Xin, Patrick Wendell, Ion Stoica, Andy Konwinski and Ali Ghodsi in Announcements
Some brands start out as, well, brands. A lot of work goes into the concept and painting the picture before the business is...

What Is a Lakehouse?
January 30, 2020 by Ben Lorica, Michael Armbrust, Reynold Xin, Matei Zaharia and Ali Ghodsi in Engineering Blog
Read Building the Data Lakehouse to explore why lakehouses are the data architecture of the future with the father of the data warehouse...