We’re excited to release the customer-managed key (CMK) public previews for Azure Databricks and for Databricks workspaces on AWS (Amazon Web Services), with full support for production deployments. On Microsoft Azure, you can now use your own key to encrypt the notebooks and queries managed by Azure Databricks; this capability is available in the Premium pricing tier in all Azure Databricks regions. On AWS, you can bring your own key to encrypt the data on DBFS and cluster volumes; this capability is available in the Enterprise pricing tier in all AWS regions that support the E2 architecture. We have received great feedback from our global customers during the corresponding private previews, as these capabilities allow them to use the full power of the Databricks Lakehouse Platform to process highly sensitive and confidential data.
Let’s dive deeper into these capabilities.
CMK Managed Services for Azure Databricks
An Azure Databricks workspace is a managed application on the Azure cloud that delivers enhanced security capabilities through a simple and well-integrated architecture. With a customer-managed key from an Azure Key Vault instance, you can encrypt the notebooks, queries and secrets stored in the Azure Databricks regional infrastructure. This public preview release is supported for production deployments.
It’s already possible to bring your own key from Azure Key Vault to encrypt the data stored on DBFS (Blob Storage) and Azure-native data sources like ADLS Gen2 and Azure SQL. You can seamlessly process such data with Azure Databricks without having to configure any settings for a workspace.
| Use Case | Status |
| --- | --- |
| CMK Managed Services (notebooks, queries and secrets stored in control plane) | Public Preview |
| CMK Workspace Storage (DBFS) | Generally Available (GA) |
| CMK for your own data sources | Already works seamlessly |
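As a rough illustration of what enabling CMK Managed Services involves, the sketch below builds the encryption settings that point a workspace at a key in Azure Key Vault. The property names (`encryption`, `entities`, `managedServices`, `keyVaultProperties`) are assumptions modeled on the Azure Resource Manager schema for Databricks workspaces, and the helper `managed_services_encryption`, the vault URI, and the key values are hypothetical; check the current Azure Databricks documentation before using any of it.

```python
import json


def managed_services_encryption(key_vault_uri: str, key_name: str, key_version: str) -> dict:
    """Build an (assumed) `encryption` property fragment for a workspace resource.

    The field names follow what we believe the ARM workspace schema looks like;
    treat them as assumptions, not as a confirmed contract.
    """
    if not key_vault_uri.startswith("https://"):
        raise ValueError("key_vault_uri should be an https:// Key Vault URI")
    if not key_name or not key_version:
        raise ValueError("key_name and key_version are both required")
    return {
        "encryption": {
            "entities": {
                # Managed services = notebooks, queries and secrets in the control plane.
                "managedServices": {
                    "keySource": "Microsoft.Keyvault",
                    "keyVaultProperties": {
                        "keyVaultUri": key_vault_uri,
                        "keyName": key_name,
                        "keyVersion": key_version,
                    },
                }
            }
        }
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    fragment = managed_services_encryption(
        "https://example-vault.vault.azure.net/", "example-key", "0123456789abcdef"
    )
    print(json.dumps(fragment, indent=2))
```

The fragment would be merged into the `properties` of the workspace resource in a deployment template. Keeping it in a small function makes it easy to validate the inputs before a deployment is attempted.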
CMK Workspace Storage for Databricks on AWS
A Databricks workspace on AWS delivers the same security capabilities as described above for Azure. Use a customer-managed key from an AWS KMS instance to encrypt the data stored on DBFS and cluster EBS volumes. This is a public preview release. If you have already configured an account-level default encryption key for EBS volumes, you can opt out of using the CMK capability for cluster EBS volumes.
Also in public preview is the ability to use customer-managed keys to encrypt the notebooks, queries and secrets stored in the Databricks regional infrastructure. It’s also already possible to use a customer-managed key to encrypt your data on AWS-native data sources like S3 and RDS. Refer to this documentation to allow your Databricks clusters to encrypt and decrypt data on your non-DBFS S3 buckets.
| Use Case | Status |
| --- | --- |
| CMK Workspace Storage (DBFS and Cluster EBS Volumes) | Public Preview |
| CMK Managed Services (notebooks, queries and secrets stored in control plane) | Public Preview |
| CMK for your own data sources | Already works seamlessly |
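To make the AWS options above concrete, here is a minimal sketch of a key configuration request body, including the opt-out for cluster EBS volumes. The field names (`use_cases`, `aws_key_info`, `key_arn`, `key_alias`, `reuse_key_for_cluster_volumes`) and the use-case values `STORAGE` and `MANAGED_SERVICES` are assumptions modeled on the Databricks Account API; the helper `build_cmk_request` and the key ARN are hypothetical placeholders.

```python
import json
from typing import Optional, Sequence

# Assumed use-case names: STORAGE covers DBFS and cluster EBS volumes,
# MANAGED_SERVICES covers notebooks, queries and secrets.
ALLOWED_USE_CASES = {"STORAGE", "MANAGED_SERVICES"}


def build_cmk_request(
    key_arn: str,
    key_alias: Optional[str] = None,
    use_cases: Sequence[str] = ("STORAGE",),
    reuse_key_for_cluster_volumes: bool = True,
) -> dict:
    """Build an (assumed) request body that registers an AWS KMS key."""
    if not key_arn.startswith("arn:aws:kms:"):
        raise ValueError("key_arn should be an AWS KMS key ARN")
    unknown = set(use_cases) - ALLOWED_USE_CASES
    if not use_cases or unknown:
        raise ValueError(f"unsupported use cases: {sorted(unknown)}")

    aws_key_info = {"key_arn": key_arn}
    if key_alias:
        aws_key_info["key_alias"] = key_alias
    if "STORAGE" in use_cases:
        # Set to False to opt out of CMK for cluster EBS volumes, for example
        # when an account default EBS encryption key is already in place.
        aws_key_info["reuse_key_for_cluster_volumes"] = reuse_key_for_cluster_volumes
    return {"use_cases": list(use_cases), "aws_key_info": aws_key_info}


if __name__ == "__main__":
    body = build_cmk_request(
        "arn:aws:kms:us-west-2:111122223333:key/example-key-id",
        key_alias="alias/example-databricks-cmk",
        reuse_key_for_cluster_volumes=False,
    )
    print(json.dumps(body, indent=2))
```

The sketch only assembles and validates the payload; sending it to the account endpoint, and granting Databricks access in the KMS key policy, are separate steps covered in the documentation linked below.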
Get started with the enhanced security capabilities by deploying Azure Databricks and Databricks on AWS workspaces with customer-managed keys for Managed Services and Workspace Storage. Please refer to the following resources:
- CMK Managed Services for Azure Databricks
- CMK DBFS for Azure Databricks
- CMK Managed Services for Databricks on AWS
- CMK Workspace Storage for Databricks on AWS
Please refer to Platform Security for Enterprises for a deeper view into how we apply a security-first mindset while building the most popular lakehouse platform on Azure and AWS.