by Jeremy Lewallen, Kent Marten, Ina Felsheim and Maggie Wu
As the season of giving approaches, we at Databricks have been making our list and checking it twice. Instead of toys and treats, though, we've been wrapping up powerful performance improvements for our users. Through analyzing billions of production queries and listening closely to our community's wishes, we're excited to deliver a package of enhancements that make your data workloads run faster and more efficiently than ever.
Just as Santa's workshop crafts everything from traditional wooden toys to the latest electronic gadgets, Databricks SQL has become the ultimate data workshop, expertly handling diverse workloads for users of all needs. Some teams need robust ETL engines to power their data assembly lines, while others require interactive dashboards for instant insights, and still others seek powerful tools for data exploration and discovery. By carefully analyzing customer feedback and usage patterns across billions of queries, we've identified the top items on our users' wish lists:
At Databricks, we understand that performance is paramount for delivering a seamless user experience and optimizing costs. At the Data and AI Summit (DAIS) 2024, we introduced the Databricks Performance Index, intended to measure the impact of our AI performance optimizations on real-world workloads. The index is derived statistically from repeating workloads, accounting for changes irrelevant to the engine, and computed against billions of production queries; on this index, lower is better. A little over five months later, we're proud to announce that Databricks SQL is now 77% faster than when it launched in 2022.
This isn't just a benchmark. We track millions of real customer queries that run repeatedly over time. Analyzing these similar workloads allows us to observe a 77% speed improvement, reflecting the cumulative impact of our continued optimizations.
In other words, if you were using Databricks SQL six months ago for BI workloads, those same workloads are now, on average, 14% faster. You didn’t have to make any changes to enjoy these improvements; think of it as a touch of Santa’s magic.
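To make the idea of an index built from repeating workloads more concrete, here is a minimal sketch. It is not the actual Databricks Performance Index methodology, which this post does not spell out. It assumes the index is a geometric mean of per-query runtime ratios, and the function name, query fingerprints, and runtimes are all hypothetical.

```python
from math import exp, log


def performance_index(baseline_secs, current_secs):
    """Geometric mean of current/baseline runtime ratios (an assumed formula).

    Only queries present in both periods ("repeating" queries) are compared.
    A value below 1.0 means those queries got faster: lower is better.
    """
    shared = baseline_secs.keys() & current_secs.keys()
    if not shared:
        raise ValueError("no repeating queries to compare")
    log_ratios = [log(current_secs[q] / baseline_secs[q]) for q in shared]
    return exp(sum(log_ratios) / len(log_ratios))


# Hypothetical runtimes in seconds, keyed by a made-up query fingerprint.
baseline = {"q_sales_daily": 40.0, "q_top_customers": 10.0, "q_retired": 5.0}
current = {"q_sales_daily": 20.0, "q_top_customers": 5.0, "q_new": 3.0}

index = performance_index(baseline, current)
print(f"index = {index:.2f}")  # q_retired and q_new are ignored
```

Comparing only the queries that repeat keeps changes in the workload mix from skewing the result, and a geometric mean keeps one very long query from dominating it. A real index would also need to correct for factors unrelated to the engine, such as data growth, which this sketch leaves out.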
As organizations scale their analytics workloads on Databricks SQL, three key areas consistently emerge as priorities for optimization: complex joins that slow query performance, supporting concurrent workloads seamlessly, and accelerating queries for both beginners and experts. Based on analysis across our customer base, we've developed targeted performance improvements to address each of these areas. Here are some examples:
You can try all of these improvements now. Predictive Optimization with statistics is now in Gated Public Preview; sign up here to ensure your queries run faster and more consistently without manual tuning.
Reducing the total cost of ownership is a crucial priority for Databricks, and our latest improvements are designed to deliver substantial savings for our customers.
Building on advances from earlier this year that made downscaling 5x faster than with our 2023 AI models, we've further refined our algorithms to handle additional scenarios even more efficiently. These latest improvements allow Databricks SQL to detect and release idle compute resources more rapidly, reducing DBU compute expenses for our customers. With faster downscaling and improved TCO, we're wrapping up the year with a gift that keeps on giving: more savings!
Enhanced compression: We're rolling out an advanced data compression method, which promises even more significant cost savings by reducing data storage sizes and improving I/O efficiency. This move will further lower your storage expenses while maintaining high performance.
The greatest gift is time. Our engineers have been working hard on productivity and user interface improvements that reduce the time your tasks take. We do this by incorporating AI to automate tasks, by reducing friction as you move between tools in your data ecosystem, through serverless, and more. Like a new bicycle, these gifts are so big that they get their own gift bags and bows. Here are some highlights:
Let Databricks SQL give you the gift of enhanced performance and reduced costs this holiday season. Whether running ETL pipelines, powering business intelligence tools, or conducting exploratory data analysis, our latest improvements are designed to help you achieve more with less.
Ready to experience these benefits firsthand? Contact your Databricks representative to start a proof-of-concept today and discover how Databricks SQL can transform your data operations. Our team is here to support you every step of the way, ensuring you maximize the value of your data intelligence platform.
What's at the top of every data team's wish list this year? It’s no secret: the best data warehouse is a lakehouse! Unwrap your free trial of Databricks SQL today.
To dive deeper into our performance optimizations and cost-saving features, check out our previous blog post: Databricks SQL Year in Review (Part I): AI-optimized Performance and Serverless Compute. Stay tuned for the next iteration of Performance and Total Cost of Ownership improvements in the first part of 2025.