Introducing Apache Spark™ 3.1
We are excited to announce the availability of Apache Spark 3.1 on Databricks as part of Databricks Runtime 8.0. We want to thank the Apache Spark™ community for all their valuable contributions to the Spark 3.1 release. Continuing with the objectives to make Spark faster, easier and smarter, Spark 3.1 extends its scope with the...
Spark + AI Summit Europe is Expanding and Getting a New Name: Data + AI Summit Europe
Back in 2013, we held the first Spark Summit — a gathering of the Apache Spark™ community with leading contributors and production users sharing their wisdom. Since the first event, Spark’s success has accelerated the evolution of data science, data engineering and analytics. As the data community has expanded, we’ve evolved the content and the...
Welcoming Redash to Databricks
This morning at Spark and AI Summit, we announced that Databricks has acquired Redash, the company behind the popular open source project of the same name. With this acquisition, Redash joins Apache Spark, Delta Lake, and MLflow to create a larger and more thriving open source system to give data teams best-in-class tools. I would...
Introducing Apache Spark 3.0
We’re excited to announce that the Apache Spark™ 3.0.0 release is available on Databricks as part of our new Databricks Runtime 7.0. The 3.0.0 release includes over 3,400 patches and is the culmination of tremendous contributions from the open-source community, bringing major advances in Python and SQL capabilities and a focus on ease of use...
Evolving the Databricks brand
Some brands start out as, well, brands. A lot of work goes into the concept and painting the picture before the business is ever launched. Databricks is different. It always has been and always will be an engineering-led company. Databricks’ model for innovation is inspired by the open-source community. This is where our roots run...
What is a Lakehouse?
Over the past few years at Databricks, we've seen a new data management architecture that emerged independently across many customers and use cases: the lakehouse. In this post we describe this new architecture and its advantages over previous approaches. Data warehouses have a long history in decision support and business intelligence applications. Since its inception...
Solving the World’s Toughest Problems with the Growing Open Source Ecosystem and Databricks
We started Databricks in 2013 in a tiny little office in Berkeley with the belief that data has the potential to solve the world’s toughest problems. We entered 2020 as a global organization with over 1,000 employees and a customer base spanning from two-person startups to Fortune 10s. In this blog post, let’s take a...
Delta Lake Now Hosted by the Linux Foundation to Become the Open Standard for Data Lakes
At today’s Spark + AI Summit Europe in Amsterdam, we announced that Delta Lake is becoming a Linux Foundation project. Together with the community, the project aims to establish an open standard for managing large amounts of data in data lakes. The Apache 2.0 software license remains unchanged. Delta Lake focuses on improving the reliability...
Introducing Brickchain: Planet-scale Unified Analytics
Today we are excited to announce Brickchain, the next generation technology for zettabyte-scale analytics, by harnessing all the compute power on the planet. Brickchain is the most scalable, secure, and collaborative data technology ever invented. As you may know, Databricks was founded by the original creators of Apache Spark, a unified analytics engine that uses...
Introducing Apache Spark 2.4
UPDATED: 11/19/2018 We are excited to announce the availability of Apache Spark 2.4 on Databricks as part of the Databricks Runtime 5.0. We want to thank the Apache Spark community for all their valuable contributions to the Spark 2.4 release. Continuing with the objectives to make Spark faster, easier, and smarter, Spark 2.4 extends its...