Databricks for GxP

David Veuve
Douglas Moore
Amir Kermany
Michael Sanky
What is GxP?

GxP stands for "Good x Practices," where x refers to a specific discipline, such as clinical, manufacturing, or laboratory. The goal of GxP compliance is to ensure that a compliant process runs reliably and can survive failures and human error. For example, a collection of guidelines, standards, and regulations governs the creation and manufacture of drugs, and it's important to all of us that the final product provided to consumers can be relied upon. In the United States, many GxP guidelines are described in federal regulations such as 21 CFR Part 11.

When data is processed in a workflow that falls under GxP, the data-processing infrastructure generally also falls under GxP. The infrastructure that stores and processes that data may need to be qualified, that is, verified to be properly installed and configured. Qualification can be achieved by defining Standard Operating Procedures (SOPs) for system installation or deployment, and by monitoring and auditing the configuration of the installed infrastructure.
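As a rough illustration of auditing an installed configuration, the sketch below compares an observed configuration against a baseline approved in an SOP and reports any drift. It is a minimal sketch, not a Databricks API: the function `find_config_drift`, the setting names, and the sample values are all hypothetical, and in practice the observed settings would be gathered from whatever configuration source your deployment exposes.

```python
from typing import Any

# Placeholder used when a setting is present on one side only (hypothetical).
_MISSING = "<missing>"


def find_config_drift(
    approved: dict[str, Any], observed: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {setting: (approved_value, observed_value)} for every mismatch."""
    drift = {}
    # Walk the union of keys so that added and removed settings are both caught.
    for key in sorted(approved.keys() | observed.keys()):
        expected = approved.get(key, _MISSING)
        actual = observed.get(key, _MISSING)
        if expected != actual:
            drift[key] = (expected, actual)
    return drift


# Hypothetical baseline taken from an SOP, and a hypothetical observed state.
approved_baseline = {
    "runtime_version": "X.Y-approved",
    "autoscaling": False,
    "audit_logging": True,
}
observed_config = {
    "runtime_version": "X.Y-approved",
    "autoscaling": True,
    "audit_logging": True,
}

drift = find_config_drift(approved_baseline, observed_config)
for setting, (expected, actual) in drift.items():
    print(f"DRIFT {setting}: expected {expected!r}, found {actual!r}")
```

Running a check like this on a schedule and keeping its output gives an audit trail showing that the installed configuration still matches what was qualified.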

Any software algorithms used may also need to be validated to verify that the algorithm robustly produces a correct result. For software, validation is normally achieved through strong software development lifecycle (SDLC) processes.
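One common way an SDLC supports validation is an automated test suite that runs the algorithm against inputs with known, pre-approved results. The sketch below assumes a hypothetical calculation, `percent_recovery`, with a small table of reference cases; the function, the cases, and the tolerances are illustrative only and are not taken from any Databricks guidance.

```python
import math


def percent_recovery(measured: float, nominal: float) -> float:
    """Hypothetical algorithm under validation: measured as a percent of nominal."""
    if nominal <= 0:
        raise ValueError("nominal must be positive")
    if measured < 0:
        raise ValueError("measured must be non-negative")
    return 100.0 * measured / nominal


# Reference cases (measured, nominal, expected), assumed to be approved up front.
VALIDATION_CASES = [
    (50.0, 50.0, 100.0),
    (45.0, 50.0, 90.0),
    (0.0, 50.0, 0.0),
]


def run_validation() -> list[str]:
    """Run every reference case and return a list of failure messages."""
    failures = []
    for measured, nominal, expected in VALIDATION_CASES:
        result = percent_recovery(measured, nominal)
        if not math.isclose(result, expected, rel_tol=1e-9, abs_tol=1e-12):
            failures.append(
                f"percent_recovery({measured}, {nominal}) = {result}, expected {expected}"
            )
    # Robustness check: invalid input must be rejected, not silently computed.
    try:
        percent_recovery(1.0, 0.0)
        failures.append("zero nominal was accepted")
    except ValueError:
        pass
    return failures


failures = run_validation()
print("validation passed" if not failures else "\n".join(failures))
```

Wiring a suite like this into a CI pipeline means every code change is re-validated against the same reference cases before it is deployed.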

The requirements imposed by GxP are similar to the requirements imposed in other regulated settings, such as CLIA/CAP requirements for validating results returned in clinical laboratories.

Overview of GxP-compliant workloads on Databricks

Databricks supports GxP-compliant workloads, and many customers now run them, particularly in manufacturing and R&D.

Most of GxP compliance relates to the design of the workloads that the customer chooses to run. Because Databricks is a compute platform, customers rely on SOPs for deployment and monitoring, along with a strong SDLC for their development pipeline. Databricks is well suited to support both, thanks to its broad API support, log availability, and common integration with the continuous integration / continuous delivery (CI/CD) pipelines that support a strong SDLC.

Databricks recently released documentation that provides guidance on how customers can install, configure, and validate Databricks in GxP-compliant environments.

To learn more, visit the Databricks Security and Trust Center.
