Engineering | Databricks Blog

Meltdown and Spectre: Exploits and Mitigation Strategies

January 16, 2018 by Chris Stevens, Nicolas Poggi, Thomas Desrosiers and Reynold Xin in Engineering

In an earlier blog post, we analyzed the performance impact of Meltdown and Spectre on big data workloads in the cloud. In...

Meltdown and Spectre's Performance Impact on Big Data Workloads in the Cloud

January 12, 2018 by Chris Stevens, Nicolas Poggi, Thomas Desrosiers and Reynold Xin in Engineering

Last week, the details of two industry-wide security vulnerabilities, known as Meltdown and Spectre, were released. These exploits enable cross-VM and cross-process...

Databricks Cache Boosts Apache Spark Performance

January 9, 2018 by Alicja Luszczak, Michał Szafrański, Michał Switakowski and Reynold Xin in Engineering

We are excited to announce the general availability of Databricks Cache, a Databricks Runtime feature as part of the Unified Analytics Platform that...

The Architecture of the Next CERN Accelerator Logging Service

December 14, 2017 by Jakub Wozniak in Solutions

This is a community guest blog from Jakub Wozniak, a software engineer and project technical lead at CERN physics laboratory, further expounding...

Transparent Autoscaling of Instance Storage

December 1, 2017 by Greg Owen, Srinath Shankar and Prakash Chockalingam in Platform

Big data workloads require access to disk space for a variety of operations, generally when intermediate results will not fit in memory. When...

Databricks Achieves AWS Machine Learning Competency Status

November 28, 2017 by Brian Dirking in Partners

Today we announced that Amazon has awarded Databricks with the Amazon Web Services (AWS) Machine Learning (ML) Competency status. This designation recognizes Databricks...

What AWS Per-Second Billing Means for Big Data Processing

November 6, 2017 by Prakash Chockalingam in Company

Databricks, the Unified Analytics Platform, has always been a cloud-first platform. We believe in the scalability and elasticity of the cloud so that...

Access Control for Databricks Jobs

November 1, 2017 by Yandong Mao, Yu Peng, Andrew Chen and Prakash Chockalingam in Company

Secure your production workloads end-to-end with Databricks’ comprehensive access control system. Databricks offers role-based access control for clusters and workspace to secure infrastructure...

Continuous Integration & Continuous Delivery with Databricks

October 30, 2017 by Yu Peng, Andrew Chen and Prakash Chockalingam in Platform

Continuous integration and continuous delivery (CI/CD) is a practice that enables an organization to rapidly iterate on software changes while maintaining stability, performance...

Introducing Pandas UDF for PySpark

October 30, 2017 by Li Jin in Solutions

NOTE: Spark 3.0 introduced a new pandas UDF. You can find more details in the following blog post: New Pandas UDFs and Python...