How Datavant and Databricks are Transforming Life Sciences with Data Sharing

Mike Sanky
Adam Crown
Kenzie Alexander
Itai Weiss

Data is the lifeblood of healthcare

Data is at the center of advancing medical breakthroughs and improving patient outcomes across healthcare and life sciences. Healthcare organizations, research institutions, and pharmaceutical companies alike recognize the value of collaboration and data sharing in driving innovation. In fact, data and analytics leaders who share data externally may generate three times more measurable economic benefit than those who do not, according to Gartner.

Enabling collaboration across all of this data is a challenge

Despite these benefits, a number of challenges inhibit effective data sharing across this complex and highly sensitive ecosystem. Healthcare organizations are looking for simpler ways to collaborate, but their data is often stored across different cloud providers. To collaborate more, organizations must also protect sensitive patient information while meeting strict regulatory requirements and ethical obligations.

Unlocking the power of healthcare data with Datavant and Databricks

All decisions in healthcare should be driven by data. However, the fragmentation of health data is one of the single greatest challenges facing the healthcare system today. Datavant's mission is to connect the world's health data to improve patient outcomes. Datavant is the healthcare industry's trusted, neutral, and ubiquitous technology for connecting health data, enabling access to an open ecosystem of hundreds of sources and users.

Datavant reduces the friction of data sharing across the healthcare industry through technology that protects the privacy of patients while supporting the linkage of patient health records across datasets.

Datavant is excited to bring this capability as a native solution for Databricks customers and to be one of the Databricks Marketplace launch partners.

Speeding time-to-insight with Datavant and Databricks

With Datavant and Databricks, organizations can unlock an ecosystem of new data sources, increase speed-to-market, and uphold the highest levels of security. For example, life sciences organizations can enrich their specialty pharma and proprietary data with real-world data (RWD) for faster analytics, unlocking use cases ranging from patient outcomes to drug effectiveness.

How it works

With Datavant, customers can tokenize and de-identify their data on Databricks without moving the data, because tokenization and linking run natively on Databricks. Using Datavant, a customer can generate site-specific keys for each patient. These keys can then be used to match a patient's records within, and across, datasets. This integration unlocks new data flows, use cases, and connections through Databricks, making it possible to unify data shared via the Databricks Marketplace.
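To make the idea of tokenizing and then linking records concrete, here is a minimal Python sketch. It is not Datavant's algorithm or API: it assumes a keyed one-way hash (HMAC-SHA256) over a few normalized identifying fields, and every name in it (`SITE_KEY`, `tokenize`, `deidentify`, `link`, the sample records) is hypothetical and exists only for illustration.

```python
import hashlib
import hmac

# Hypothetical site-specific secret; a real deployment would manage keys securely.
SITE_KEY = b"hypothetical-site-key"

ID_FIELDS = ("first", "last", "dob")


def normalize(first, last, dob):
    """Canonicalize identifying fields so formatting differences do not change the token."""
    return "|".join(part.strip().lower() for part in (first, last, dob))


def tokenize(site_key, first, last, dob):
    """Keyed one-way hash of the normalized identifiers (an assumed stand-in technique)."""
    message = normalize(first, last, dob).encode("utf-8")
    return hmac.new(site_key, message, hashlib.sha256).hexdigest()


def deidentify(records, site_key):
    """Replace the identifying fields of each record with a single token."""
    out = []
    for record in records:
        token = tokenize(site_key, *(record[f] for f in ID_FIELDS))
        rest = {k: v for k, v in record.items() if k not in ID_FIELDS}
        out.append({"token": token, **rest})
    return out


def link(left, right):
    """Join two de-identified datasets on the token."""
    by_token = {}
    for record in right:
        by_token.setdefault(record["token"], []).append(record)
    return [{**l, **r} for l in left for r in by_token.get(l["token"], [])]


# Made-up sample records for two datasets held at the same site.
claims = [
    {"first": "Ana", "last": "Lopez", "dob": "1980-02-14", "drug": "X"},
    {"first": "Bo", "last": "Chen", "dob": "1975-07-01", "drug": "Y"},
]
labs = [
    {"first": " ana ", "last": "LOPEZ", "dob": "1980-02-14", "a1c": 6.1},
    {"first": "Cy", "last": "Ng", "dob": "1990-11-30", "a1c": 5.4},
]

linked = link(deidentify(claims, SITE_KEY), deidentify(labs, SITE_KEY))
```

In this sketch the two records for the same patient link on the token even though the name is formatted differently, and the linked output carries no name or date of birth. A different site key would produce different tokens for the same patient, which is the property the term "site-specific" suggests.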

Figure: Cloud Data Flow

Looking ahead to what's next

Organizations want control over how their data is used and queried, and today they often lack visibility into who can access what. Enter data clean rooms, a solution to address data sharing challenges facing the healthcare industry.

Clean rooms - privacy-safe collaboration for all your data, analytics, and AI

So what are data clean rooms? Data clean rooms help organizations share and join their existing data, and run complex workloads in any language - Python, R, SQL, Java, and Scala - on the data while maintaining data privacy. Clean rooms establish a framework where only authorized stakeholders can interact with predefined queries, code, and notebooks, upholding the highest level of privacy protection for protected health information (PHI), a critical requirement in the healthcare industry.

Databricks is excited to announce our Clean Room solution

Databricks Clean Rooms, now in private preview, will enable organizations to compartmentalize data and run a clean room workflow from within their lakehouse, enabling collaboration and sharing of insights without compromising patient privacy or regulatory compliance. This will be done in two ways.

First, data will not be replicated when being used in the clean room. Instead, it is accessed with Delta Sharing, an open-source protocol. Delta Sharing enables clean room collaborators to work together across clouds, across regions, and even across data platforms, all without requiring data movement.
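As an illustration of reading shared data without copying it into your own storage first, the sketch below uses the open-source `delta-sharing` Python connector. It shows the general Delta Sharing access pattern, not how Databricks Clean Rooms is wired internally. The profile path and the share, schema, and table names are hypothetical, and the connector is imported lazily so that the URL helper works without it installed.

```python
def table_url(profile_path, share, schema, table):
    """Build the '<profile-file>#<share>.<schema>.<table>' address the connector expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path, share, schema, table):
    """Read a shared table into pandas directly from the provider's sharing server."""
    import delta_sharing  # assumes: pip install delta-sharing

    return delta_sharing.load_as_pandas(table_url(profile_path, share, schema, table))


# Hypothetical names; the profile file is issued by the data provider.
url = table_url("config.share", "rwd_share", "claims", "tokenized_claims")
```

Calling `load_shared_table("config.share", "rwd_share", "claims", "tokenized_claims")` would only succeed against a real sharing server and a valid profile file, so it is left out of the sketch.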

Second, all collaborators will need to sign off on the analytical code being used in clean rooms.
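One way to picture the sign-off requirement is as a gate that lets code run only when every collaborator has approved that exact version. The following Python sketch is an assumption about how such a gate could work, not the Databricks implementation. The `CleanRoomJob` class, its methods, and the collaborator names are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


def code_fingerprint(code):
    """Hash the code so that an approval is tied to one exact version of it."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


@dataclass
class CleanRoomJob:
    code: str
    collaborators: frozenset
    # Maps each collaborator to the fingerprint of the code they reviewed.
    approvals: dict = field(default_factory=dict)

    def approve(self, collaborator, reviewed_code):
        """Record a collaborator's sign-off on the code they actually reviewed."""
        if collaborator not in self.collaborators:
            raise PermissionError(f"{collaborator} is not part of this clean room")
        self.approvals[collaborator] = code_fingerprint(reviewed_code)

    def is_runnable(self):
        """True only if every collaborator approved the current version of the code."""
        current = code_fingerprint(self.code)
        return all(self.approvals.get(c) == current for c in self.collaborators)


job = CleanRoomJob(
    code="SELECT COUNT(*) FROM linked_patients",
    collaborators=frozenset({"pharma", "provider"}),
)
job.approve("pharma", job.code)
partially_approved = job.is_runnable()   # one sign-off is not enough
job.approve("provider", job.code)
fully_approved = job.is_runnable()       # every collaborator has signed off
```

Tying each approval to a hash of the code means that any later edit invalidates earlier sign-offs, so collaborators would have to review the changed code again before it could run.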

The result - data can be shared within your lakehouse to derive insights, enabling collaboration in a secure environment with internal and external stakeholders on any cloud, in a privacy-safe way.

Datavant tokens will be usable within a Databricks Clean Room to enable analytics across datasets. The integration will allow customers to store a single version of the datasets used in clean rooms and to transform data into a joinable key within the clean room itself.

Interested in learning more? Get started with Datavant and Databricks today.
