Announcing the General Availability of the Databricks SQL Statement Execution API

Today, we are excited to announce the general availability of the Databricks SQL Statement Execution API on AWS and Azure, with support for...

Understanding Caching in Databricks SQL: UI, Result, and Disk Caches

Caching is an essential technique for improving the performance of data warehouse systems by avoiding the need to recompute or fetch the same...

Use Databricks Pools to Speed up your Data Pipelines and Scale Clusters Quickly

November 10, 2019 by Chris Stevens and David Meyer
Data Engineering teams deploy short, automated jobs on Databricks. They expect their clusters to start quickly, execute the job, and terminate. Data Analytics...

Adventures in the TCP stack: Uncovering performance regressions in the TCP SACKs vulnerability fixes

Last month, we announced that the Databricks platform was experiencing network performance regressions due to Linux patches for the TCP SACKs vulnerabilities. The regressions were observed in less than 0.2% of cases when running the Databricks Runtime (DBR) on the Amazon Web Services (AWS) platform. In this post, we will dive deeper into our analysis that determined the TCP stack was the source of the degradation. We will discuss the symptoms we were seeing,

Meltdown and Spectre: Exploits and Mitigation Strategies

In an earlier blog post, we analyzed the performance impact of Meltdown and Spectre on big data workloads in the cloud. In...

Meltdown and Spectre's Performance Impact on Big Data Workloads in the Cloud

Last week, the details of two industry-wide security vulnerabilities, known as Meltdown and Spectre, were released. These exploits enable cross-VM and cross-process...