Fueled by the exponential growth in external data and AI for innovation, organizations across all industries are looking for effective ways to collaborate with their partners in a privacy-safe way. Some organizations have limited collaborative solutions and are often required to relinquish control over how their sensitive data is shared with little to no visibility into how their data is consumed. This creates a significant risk for potential data misuse and data privacy breaches. 

Organizations need an open, flexible, yet privacy-safe way to collaborate and run AI workloads on data, and Databricks Clean Rooms meets these needs. As we announced at this year's Data + AI Summit, Clean Rooms is in Public Preview on AWS and Azure (Request access to preview here). Clean Rooms is powered by Delta Sharing and allows businesses to collaborate with their customers and partners on any cloud without compromising privacy or sharing sensitive data. Participants in a clean room can securely share and join their existing data, and run complex workloads using any language, such as Python, which provides native support for ML. When collaborating in a clean room, your data stays in place and you are always in control of where and how the data is being used.

Databricks Clean Rooms is built for enterprises that are looking at ways to help accelerate innovation with data-driven insights. For example, watch the recent Data + AI Summit session, “Collaboration with Databricks Clean Rooms and PETs” to hear from Mastercard and learn more about how they protect sensitive data by dynamically determining which privacy-enhancing technologies (PETs) to use based on their collaborators, data, and use cases.

Any language, any workload

Databricks Clean Rooms is built for any analytics and AI workload. Unlike many existing solutions that limit functionality to SQL queries on tabular data, Databricks Clean Rooms also allows you to run your computations in Python. This flexibility enables both simple joins and complex computations for ML/AI use cases. Leveraging the full power of Databricks Notebooks, you can run SQL or Python for complex compute and ML/AI workloads. Collaborators can also use private libraries to keep sensitive algorithms or data processing logic hidden, which ensures your IP remains protected. Finally, support for Scala and Java is coming soon.
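As a rough illustration of the "simple join" end of that spectrum, the sketch below joins two collaborators' tables on a shared key and returns only an aggregate. It is plain Python over made-up data: the table contents, field names, and the `overlap_summary` function are hypothetical and are not part of any Databricks API. In an actual clean room notebook, the inputs would be tables shared through Delta Sharing.

```python
from collections import defaultdict

# Hypothetical inputs: each collaborator contributes one table.
advertiser_customers = [
    {"customer_id": "c1", "segment": "sports"},
    {"customer_id": "c2", "segment": "sports"},
    {"customer_id": "c3", "segment": "travel"},
]
publisher_impressions = [
    {"customer_id": "c1", "impressions": 4},
    {"customer_id": "c3", "impressions": 2},
    {"customer_id": "c9", "impressions": 7},
]


def overlap_summary(customers, impressions):
    """Join on customer_id and return impressions totalled per segment."""
    segment_by_id = {row["customer_id"]: row["segment"] for row in customers}
    totals = defaultdict(int)
    for row in impressions:
        segment = segment_by_id.get(row["customer_id"])
        if segment is not None:  # keep only matched customers (inner join)
            totals[segment] += row["impressions"]
    return dict(totals)


summary = overlap_summary(advertiser_customers, publisher_impressions)
print(summary)
```

Only the per-segment totals leave the function; the row-level records from either side are never returned.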

Any cloud, with no replication

Databricks Clean Rooms is built for collaboration across regions, clouds, and platforms. For example, collaborators on different clouds, such as one on AWS and another on Azure, can work together in Databricks Clean Rooms. This secure, open, flexible collaboration is powered by Delta Sharing. You can collaborate on all your data and AI, including non-tabular or unstructured data and AI models, all while protecting the privacy of the underlying data.

Collaboration across data platforms is coming soon, using the new Sharing for Lakehouse Federation feature from Delta Sharing (Request access to preview here).

Any scale, any trust level

We understand the critical need for organizations to use clean rooms at scale. Databricks Clean Rooms offers robust collaboration and operational capabilities to meet this demand. 

Support for APIs, SQL commands, and built-in Databricks Workflows orchestration is coming soon, so that you can automate and manage clean rooms for all your use cases. Multiple collaborators can work together in a Databricks Clean Room at different trust levels using different approval modes. You can also access your Clean Rooms outputs in Databricks Notebooks or in your Unity Catalog, enabling seamless integration into subsequent workflows.

How does Databricks Clean Rooms work? 

Even though Clean Rooms is a powerful tool, it is easy to set up and get started. 

First, you create a clean room by selecting your preferred cloud provider and region. The clean room can be created in any cloud or region, regardless of which ones you and your collaborators currently use. This creates a privacy-safe, isolated environment hosted by Databricks. Once the clean room is created, you and your collaborators can bring your data (including unstructured data, tables, volumes, and AI models) into the clean room using Delta Sharing. None of the participants in the clean room can see or directly access each other’s data.

Finally, to perform an analysis, you can create a notebook with mutually agreed-upon code and share it in the clean room. Your collaborator can then run the notebook tasks, which are completed using serverless compute. Databricks Clean Rooms allows any collaborator to share a notebook into the clean room, have it approved, and then run it inside the clean room. This flexibility enables you to run any workload in a privacy-safe manner.
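The share, approve, and run sequence described above can be pictured with a small toy model. This is a conceptual sketch only, not the Databricks Clean Rooms API: the `ToyCleanRoom` and `Notebook` classes, their methods, and the collaborator names are all hypothetical, and the approval rule used here (every collaborator other than the author must approve) is an assumption standing in for whichever approval mode the participants actually agree on.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Notebook:
    name: str
    author: str
    code: Callable[[dict], object]  # receives the shared tables, returns an output
    approved_by: set = field(default_factory=set)


class ToyCleanRoom:
    """Illustrative model only; not the Databricks Clean Rooms API."""

    def __init__(self, collaborators):
        self.collaborators = set(collaborators)
        self._tables = {}    # shared data; never handed back to callers directly
        self.notebooks = {}

    def _require_member(self, who):
        if who not in self.collaborators:
            raise PermissionError(f"{who} is not a collaborator")

    def share_table(self, owner, name, rows):
        self._require_member(owner)
        self._tables[f"{owner}.{name}"] = list(rows)

    def share_notebook(self, notebook):
        self._require_member(notebook.author)
        self.notebooks[notebook.name] = notebook

    def approve(self, collaborator, notebook_name):
        self._require_member(collaborator)
        self.notebooks[notebook_name].approved_by.add(collaborator)

    def run(self, runner, notebook_name):
        self._require_member(runner)
        nb = self.notebooks[notebook_name]
        # Assumption: every collaborator other than the author must approve.
        missing = self.collaborators - {nb.author} - nb.approved_by
        if missing:
            raise PermissionError(f"awaiting approval from: {sorted(missing)}")
        return nb.code(self._tables)


def count_overlap(tables):
    """Agreed-upon analysis: how many customer ids appear on both sides."""
    a = {row["id"] for row in tables["acme.customers"]}
    b = {row["id"] for row in tables["globex.customers"]}
    return len(a & b)


room = ToyCleanRoom(["acme", "globex"])
room.share_table("acme", "customers", [{"id": 1}, {"id": 2}])
room.share_table("globex", "customers", [{"id": 2}, {"id": 3}])
room.share_notebook(Notebook("overlap", author="acme", code=count_overlap))

blocked = False
try:
    room.run("globex", "overlap")   # not yet approved, so this is refused
except PermissionError:
    blocked = True

room.approve("globex", "overlap")
result = room.run("globex", "overlap")
print(blocked, result)
```

In this toy model the only way to reach the shared tables is through an approved notebook, which mirrors the idea that participants see outputs of agreed-upon code rather than each other's raw data.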

How Databricks Clean Rooms Works

Common Clean Rooms Use Cases

Many use cases are emerging for clean rooms across different industries. Let’s look at some of the common ones. 

Advertising & Media

Clean rooms enable advertisers and publishers to analyze campaign performance without compromising user privacy. With this approach, advertisers get a holistic view of campaign effectiveness across platforms while protecting publisher data privacy. One use case is lookalike modeling, which uses an ML model to find similar profiles in another collaborator’s dataset without sharing the raw, underlying data. This can be a powerful technique for a variety of scenarios, including reaching a niche audience, enriching target audience profile data to enhance conversions, running retargeting, or refining existing targeting profiles.
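To make the lookalike idea concrete, here is a minimal sketch that ranks one collaborator's profiles by similarity to another collaborator's seed audience. It substitutes a simple centroid-plus-cosine-similarity ranking for a trained ML model, and everything in it (the feature vectors, the profile IDs, and the `lookalikes` function) is hypothetical illustration data rather than anything prescribed by Databricks.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def lookalikes(seed_profiles, candidate_profiles, top_k):
    """Return the IDs of the top_k candidates most similar to the seed centroid."""
    target = centroid(list(seed_profiles.values()))
    ranked = sorted(
        candidate_profiles,
        key=lambda cid: cosine(candidate_profiles[cid], target),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical feature vectors: [sports affinity, travel affinity, basket size]
seed = {"a1": [0.9, 0.1, 0.5], "a2": [0.8, 0.2, 0.6]}
candidates = {
    "p1": [0.85, 0.15, 0.55],
    "p2": [0.1, 0.9, 0.2],
    "p3": [0.7, 0.3, 0.4],
}

matches = lookalikes(seed, candidates, top_k=2)
print(matches)
```

The function returns only candidate IDs, not feature vectors, which reflects the point that the matching happens without either side receiving the other's raw profile data.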

Within this industry, a strategic partner, LiveRamp, provides Databricks Clean Rooms customers with identity-powered data infrastructure for customer modeling and analytics.

“LiveRamp and Databricks Clean Rooms give marketers the tools they need to create amazing customer experiences, all while protecting privacy. Databricks customers can harness LiveRamp’s identity-powered data infrastructure to fuel better personalization, stronger collaboration, and greater accuracy for customer modeling and analytics – the dream combination for any marketing team.”
— Mike Moreau, VP Operations, LiveRamp

Retail & Consumer Packaged Goods (CPG)

Retailers and manufacturers can use clean rooms to identify trends and optimize pricing strategies. This collaborative analysis bolsters the effectiveness of the retailer's media network by enabling more targeted advertising and providing valuable insights for campaign optimization. Another common use case is leveraging sales data for demand forecasting and inventory management.

Manufacturing 

Global manufacturers can collaborate with their partners to unlock data insights across their entire value chain with a clean room, such as driving operational efficiency with predictive maintenance. They can access data from installed sensors, raw data from their data pipeline, and also use ML models trained with historical data to help predict failures or maintenance windows. 

Healthcare & Life Sciences

Clean rooms are also valuable in healthcare and life sciences for collaborative research on patient data. Researchers from different institutions can analyze combined datasets to develop new treatments and improve patient outcomes, all while maintaining patient privacy.

Financial Services

Clean rooms are a game-changer for Know Your Customer (KYC) compliance in financial services. Institutions can securely share and analyze KYC data for faster customer onboarding, identification of potential money laundering activities, and improved overall risk management, all without revealing sensitive customer information. Fraud detection and prevention is another use case, in which financial institutions and third-party analytics providers (e.g., fintech companies, fraud detection firms) collaborate to distill key insights. A further use case is customer insights and personalization, where financial institutions and third-party analytics providers collaborate to understand customer behavior and preferences and offer personalized financial products and services.

Getting Started with Databricks Clean Rooms

Databricks Clean Rooms enables privacy-safe collaboration to help you deliver on your data and AI initiatives. Fill out the Databricks Clean Rooms interest form to register your interest ahead of the Public Preview release.

You can also watch our recent 2024 Data + AI Summit sessions about Clean Rooms to learn more about how it works and how we can help accelerate data-driven innovation: 

  • Collaboration with Databricks Clean Rooms and PETs is a customer-led session by Mastercard. Clean Rooms and Mastercard facilitate collaboration across multiple parties to solve modern data problems. Dive into the notebooks Mastercard uses to automatically determine which privacy-enhancing technologies (PETs) need to be applied based on the collaborators, data, and use cases, without impacting the end-user experience.
  • Getting Started with Databricks Clean Rooms shows you how to get started on analyzing shared data, and enable advanced use cases with Databricks Clean Rooms, such as working with data across platforms, training ML/AI models, enforcing privacy policies, incorporating proprietary libraries, analyzing unstructured data, auditing clean room actions, and others.
  • Secure Data and AI Collaboration with Databricks Clean Rooms covers the macro trends driving adoption and the common use cases for data clean rooms. This session also highlights the Mastercard use case with a demo. 