We recently announced our partnership with Databricks to bring multi-cloud data clean room collaboration capabilities to every Lakehouse. Our integration with Databricks combines the best of Databricks's Lakehouse technology with Habu's clean room orchestration platform to enable collaboration across clouds and data platforms, and make outputs of collaborative data science tasks available to business stakeholders. In this blog post, we'll outline how Habu and Databricks achieve this by answering the following questions:
Let's get started!
Data clean rooms are closed environments that allow companies to safely share data and models without concerns about compromising security or consumer privacy, or exposing underlying ML model IP. Many clean rooms, including those provisioned by Habu, provide a low- or no-code software solution on top of secure data infrastructure, which vastly expands the possibilities for access to data and partner collaborations. Clean rooms also often incorporate best-practice governance controls for data access and auditing, as well as privacy-enhancing technologies that preserve individual consumer privacy while data science tasks are executed.
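To make the idea of a privacy-enhancing control concrete, here is a minimal sketch of one common technique: returning only aggregate overlap counts and suppressing any group smaller than a minimum size. This is an illustration under stated assumptions, not a description of how Habu or Databricks implement their controls. The function name, the threshold of 50, and the sample data are all hypothetical.

```python
from collections import Counter

MIN_GROUP_SIZE = 50  # assumed threshold for illustration; not a Habu or Databricks default


def overlap_by_segment(advertiser_ids, publisher_rows, min_group_size=MIN_GROUP_SIZE):
    """Count matched users per segment, suppressing segments below the threshold.

    advertiser_ids: iterable of hashed user IDs held by one party.
    publisher_rows: iterable of (hashed_user_id, segment) pairs held by the other party.
    Only aggregate counts are returned; matched IDs never leave the function.
    """
    known = set(advertiser_ids)
    # Deduplicate so a user who appears twice in a segment is counted once.
    matched = {(uid, seg) for uid, seg in publisher_rows if uid in known}
    counts = Counter(seg for _, seg in matched)
    # Drop small groups, where a count could point to specific individuals.
    return {seg: n for seg, n in counts.items() if n >= min_group_size}


if __name__ == "__main__":
    advertiser = [f"u{i}" for i in range(100)]
    publisher = [(f"u{i}", "sports") for i in range(60)]
    publisher += [(f"u{i}", "travel") for i in range(95, 105)]
    print(overlap_by_segment(advertiser, publisher))  # {'sports': 60}
```

In this example the "travel" segment has only five matched users, so it is withheld, while "sports" clears the threshold and is reported as a count only.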
Data clean rooms have seen widespread adoption in industries such as retail, media, healthcare, and financial services as regulatory pressures and privacy concerns have increased over the last few years. As the need for access to quality, consented data increases in additional fields such as ML engineering and AI-driven research, clean room adoption will become ever more important in enabling privacy-preserving data partnerships across all stages of the data lifecycle.
In recognition of this growing need, Databricks debuted its Delta Sharing protocol last year. Delta Sharing lets data owners provision views of their data to other parties without replicating or distributing it, using tools already familiar to Databricks customers. Once data is provisioned, partners can run arbitrary workloads in any Databricks-supported language, while the data owner maintains full governance control over the data through Unity Catalog configurations.
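As a rough sketch of what reading a shared table can look like from the recipient's side, the example below uses the open-source `delta-sharing` Python connector, which addresses a table as `<profile-file>#<share>.<schema>.<table>`. This is not Habu-specific code. The profile file name, share, schema, and table names are hypothetical placeholders, and the connector is imported lazily so the address helper can be tried without installing it.

```python
def shared_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build a table address in the '<profile-file>#<share>.<schema>.<table>' form."""
    for part in (share, schema, table):
        # '.' and '#' are separators in the address, so they cannot appear in a name.
        if not part or "." in part or "#" in part:
            raise ValueError(f"invalid name component: {part!r}")
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Read a shared table from the provider's share into a pandas DataFrame."""
    import delta_sharing  # requires: pip install delta-sharing

    return delta_sharing.load_as_pandas(
        shared_table_url(profile_path, share, schema, table)
    )


if __name__ == "__main__":
    # Hypothetical names: use the profile file and share your data provider issued.
    print(shared_table_url("config.share", "retail_share", "gold", "loyalty_members"))
```

The profile file is the credential the data owner issues to a recipient. Which shares, schemas, and tables it exposes remains under the owner's control.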
Delta Sharing represented the first step towards secure data sharing within Databricks. By combining native Databricks functionality with Habu's state-of-the-art data clean room technology, Databricks customers can now share access to data without revealing its contents. With Habu's low- to no-code approach to clean room configuration, dashboards for analytics results, and integrations with activation partners, customers can expand both their clean room use cases and their partnership potential.
Habu's integration with Databricks removes the need for users to deeply understand Databricks or Habu functionality to reach their desired business outcomes from data collaboration. We've leveraged existing Databricks security primitives along with Habu's own intuitive clean room orchestration software to make it easy to collaborate with any data partner, regardless of their underlying architecture. Here's how it works:
You may be wondering, how does Habu perform all of these tasks without putting my data at risk? We've implemented three additional layers of security on top of our existing security measures to cover all aspects of our Databricks pattern integration:
There are many benefits to our integrated solution with Databricks. Delta Sharing makes collaborating on large volumes of data from the Lakehouse fast and secure. Plus, the ability to share data from your medallion architecture in a clean room opens up new insights. And finally, the ability to run Python and other code in containerized packages will enable customers to train and verify models, from traditional ML models to large language models (LLMs), on private data.
Together, the security mechanisms inherent to Databricks and the security and governance workflows built into Habu let you focus not only on the details of the data workflows in your collaborations, but also on the business outcomes of your most strategic data partnerships.
To learn more about Habu's partnership with Databricks, register now for our upcoming joint webinar on May 17, "Unlock the Power of Secure Data Collaboration with Clean Rooms." Or, connect with a Habu representative for a demo so you can experience the power of Habu + Databricks for yourself.