We are excited to announce Azure Databricks support for Azure confidential computing (ACC) in preview! With this announcement, customers can run their Azure Databricks workloads on Azure confidential virtual machines (VMs). With support for ACC, customers can build an end-to-end data platform on the Databricks Lakehouse with increased confidentiality and privacy by encrypting data in use. This builds on support for customer-managed keys (CMK) for encrypting data at rest.
This blog post will discuss confidential computing and its use cases, the security benefits of using Azure Databricks on Azure confidential computing (ACC), and our partnership with Microsoft.
Confidential computing is an industry term defined by the Confidential Computing Consortium (CCC), a community at the Linux Foundation dedicated to defining and accelerating the adoption of confidential computing. The CCC defines confidential computing as "the protection of data in use by performing computations in a hardware-based, attested Trusted Execution Environment (TEE)."
Organizations that require confidential computing are typically from regulated industries that handle and produce highly sensitive data subject to strict privacy laws and regulatory requirements. Confidential computing also attracts organizations with extremely valuable intellectual property that they want to keep secret.
By leveraging the advanced security of confidential computing, customers can process even their most sensitive data in the cloud, empowering them to unlock the full potential of AI. With this announcement, the Databricks Lakehouse platform offers customers a comprehensive solution for their data, analytics, and AI needs. Typical use cases that may require confidential computing include:
Customers can now feel empowered to use the Databricks Lakehouse platform for their most sensitive and regulated data. Azure Databricks on Azure confidential computing provides the following security and privacy benefits:
"Databricks and Microsoft have collaborated towards enabling customers with their Lakehouse workloads. We are pleased to be the first cloud provider to enable Databricks users to analyze their most sensitive data in the cloud by running their clusters on AMD SEV-SNP confidential VMs, allowing protection of this data while it is in use in memory." - Lindsey Allen, General Manager, Azure Databricks, Microsoft
We are excited to collaborate with Microsoft to bring Azure Databricks to Azure confidential computing. Microsoft has long been a thought leader in the field of confidential computing. When Azure introduced "confidential computing" in the cloud, they became the first cloud provider to offer confidential computing virtual machines and confidential container support in Kubernetes for customers to run their most sensitive workloads inside Trusted Execution Environments (TEEs).
Together, Databricks and Azure provide a robust and secure data platform for confidential computing. The confidential VMs used on ACC feature AMD EPYC™ processors that are designed to run a variety of workloads, including high performance computing, while protecting data with memory encryption provided by AMD SEV-SNP technology. These processors provide powerful, cost-effective delivery of a wide range of machine learning and AI workloads on confidential computing.
These VMs will be rolled out and made available for Azure Databricks users over the next few days. Review our documentation or watch the demo below to see how easy it is to get up and running quickly - you simply select an ACC VM for your workloads.
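For readers who define clusters in code rather than in the UI, selecting an ACC VM amounts to choosing a confidential node type in the cluster specification. The snippet below is a minimal sketch, not an official example. It assumes that the confidential VM sizes follow the Azure DCasv5/ECasv5 naming pattern (for example `Standard_DC4as_v5`) and that the specification uses the `cluster_name`, `spark_version`, `node_type_id`, and `num_workers` fields of the Databricks Clusters API. The helper `build_cluster_spec`, the cluster name, and the runtime version string are hypothetical placeholders; check the documentation for the node types actually available in your workspace and region.

```python
import json
import re

# Assumption: AMD SEV-SNP confidential VM sizes on Azure use the
# DCas_v5 / DCads_v5 / ECas_v5 / ECads_v5 naming scheme. This pattern is
# illustrative only and is not an authoritative list of supported sizes.
CONFIDENTIAL_VM_PATTERN = re.compile(r"^Standard_[DE]C\d+i?ad?s_v5$")


def build_cluster_spec(name, node_type_id, spark_version, num_workers):
    """Build a cluster spec dict, rejecting node types that do not look confidential.

    Hypothetical helper: it only assembles and validates a dictionary and
    does not call any Databricks endpoint.
    """
    if not CONFIDENTIAL_VM_PATTERN.match(node_type_id):
        raise ValueError(
            f"{node_type_id!r} does not match the assumed confidential VM naming pattern"
        )
    return {
        "cluster_name": name,
        "spark_version": spark_version,  # placeholder runtime version
        "node_type_id": node_type_id,    # the ACC VM size for the workers
        "num_workers": num_workers,
    }


# Example usage with placeholder values.
spec = build_cluster_spec("acc-demo", "Standard_DC4as_v5", "13.3.x-scala2.12", 2)
print(json.dumps(spec, indent=2))
```

Validating the node type before submitting the specification is a design choice for this sketch: it catches the case where a sensitive workload would otherwise be scheduled on a non-confidential VM by mistake.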
Tune into Microsoft Build this week to learn more about the recent innovations with Azure confidential computing. We hope to see you at our Data and AI Summit in San Francisco on June 26-29, 2023. Register today!