An open, governed life-sciences workbench that stitches NVIDIA accelerated computing and NVIDIA BioNeMo open models for biology into one end-to-end discovery platform - running entirely inside your own Databricks environment
by Mark Lee and Srijit Nair
Life sciences leaders need domain-specific, production-ready AI built directly on their own governed data. Together, Databricks and NVIDIA are enabling this shift. Combining Databricks (Unity Catalog governance, MLflow, Model Serving, and serverless GPU compute) with the NVIDIA BioNeMo Agent Toolkit, including NVIDIA CUDA-X libraries, Parabricks, and a growing catalog of biology and chemistry models such as Proteina-Complexa, lets customers run specialized AI where the data already lives, rather than shipping sensitive data to third-party APIs.
This post focuses on one of the hardest applications of that combination: life-sciences R&D and drug discovery. The work can take years and billions in investment, the data is overwhelmingly unstructured and sensitive, and it spans genomics, transcriptomics, structural biology, and chemistry, disciplines that rarely share a common toolchain. Genesis Workbench is what this looks like in practice.
Genesis Workbench is an open blueprint for a life-sciences application on Databricks - a modular workbench that brings the major stages of computational drug discovery under one roof, one UI, and one governance model. Each scientific domain is an independently deployable module:
This platform turns a standard toolbox of separate tools into a cohesive scientific workbench, and the entire environment deploys from a single script. Using a point-and-click UI powered by Databricks Apps, bench scientists can navigate the entire discovery workflow without writing code. The underlying architecture relies on open-source models managed in Unity Catalog, tracked via MLflow, and served on GPU endpoints. By centralizing both public and proprietary datasets with Databricks AI Search, we've eliminated external API dependencies. Together, this connects every step of the process: genomics findings flow into single-cell validation, target structure prediction, candidate docking, ADMET, and ranking.
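The hand-off between stages described above can be pictured with a small sketch. It is not Genesis Workbench code: the stage names follow the paragraph, but `run_pipeline`, `STAGES`, and every placeholder value are hypothetical, chosen only to show how independently swappable modules can pass findings along a shared record.

```python
from typing import Callable, Dict, List, Tuple

# A shared record that each stage reads from and adds to (hypothetical shape).
Findings = Dict[str, object]
Stage = Tuple[str, Callable[[Findings], Findings]]


def run_pipeline(stages: List[Stage], findings: Findings) -> Tuple[Findings, List[str]]:
    """Run each stage in order, passing the accumulated findings forward."""
    trace: List[str] = []
    for name, step in stages:
        findings = step(findings)
        trace.append(name)
    return findings, trace


# Placeholder stages: each one only records that it ran on the upstream result.
STAGES: List[Stage] = [
    ("genomics", lambda f: {**f, "variant": "placeholder-variant"}),
    ("single_cell", lambda f: {**f, "validated": "variant" in f}),
    ("structure", lambda f: {**f, "structure": "placeholder-structure"}),
    ("docking", lambda f: {**f, "docked": "structure" in f}),
    ("admet", lambda f: {**f, "admet_ok": bool(f.get("docked", False))}),
    ("ranking", lambda f: {**f, "ranked": bool(f.get("admet_ok", False))}),
]

result, trace = run_pipeline(STAGES, {})
print(trace)
print(result)
```

Because each stage is just an entry in a list, a module can be replaced or left out without touching the others. In this toy version, a downstream stage simply records `False` when the finding it depends on is missing.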
By bringing every stage of discovery onto one Databricks-native and NVIDIA-accelerated platform, Genesis Workbench directly addresses four problems that have historically kept AI from delivering in life-sciences R&D:

Keeping non-computational scientists in the loop. A point-and-click React UI - with interactive 3D viewers and AI-generated, plain-language result interpretations - lets a biologist call variants, simulate a knockout, design a binder, and rank candidates without writing code, while computational colleagues retain full access to the underlying jobs, models, and artifacts.
NVIDIA at every stage of the pipeline
At nearly every stage, the heavy lifting is done by NVIDIA accelerated computing and models:
| Discovery stage | NVIDIA technology | What it does in Genesis Workbench |
|---|---|---|
| Genomics | Parabricks | Part of Genomics Workflow. GPU-accelerated germline variant calling and annotation - surfacing pathogenic variants from data in your lakehouse |
| Single Cell | RAPIDS-singlecell (part of scverse) | Part of Single Cell Workflow. GPU-accelerated clustering, UMAP, and differential expression on large datasets at scale - turning an overnight batch job into interactive exploration |
| Small Molecule | GenMol (NV-GenMol-89M-v2) | Part of Guided Molecule Design workflow. Generates novel, synthesizable molecules from a seed scaffold in a closed generate→score→reseed loop, under hard constraints with optional docking in the reward |
| Large Molecule | Proteina-Complexa | Part of Enzyme Design Workflow. Flow-matching protein binder design and motif scaffolding (with ProteinMPNN + ESMFold) - from a target structure to ranked, designed binder candidates |
| Various Stages | BioNeMo Recipes | Fine-tunes and runs inference with pre-packaged models in the BioNeMo container on your data, on your infrastructure |
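The closed generate→score→reseed loop listed for the Guided Molecule Design workflow can be sketched in a few lines. This is a toy illustration under stated assumptions, not GenMol's API or the workbench's implementation: molecules are plain strings, and `toy_generate`, `toy_score`, `toy_is_valid`, and `guided_design` are hypothetical stand-ins for a generative model, a reward function (which could include a docking term), and hard constraints.

```python
import random
from typing import Callable, List, Tuple

Molecule = str  # stand-in for a molecule string; no real chemistry is checked here


def toy_generate(seed: Molecule, n: int, rng: random.Random) -> List[Molecule]:
    """Hypothetical generator: mutate one position, keeping the first character fixed."""
    out = []
    for _ in range(n):
        pos = rng.randrange(1, len(seed))
        out.append(seed[:pos] + rng.choice("CNO") + seed[pos + 1:])
    return out


def toy_score(mol: Molecule) -> float:
    """Hypothetical reward: favor 'N', penalize 'O'."""
    return mol.count("N") - 0.5 * mol.count("O")


def toy_is_valid(mol: Molecule) -> bool:
    """Hypothetical hard constraint: keep the 'C' scaffold start, at most three 'N'."""
    return mol.startswith("C") and mol.count("N") <= 3


def guided_design(
    seed: Molecule,
    generate: Callable[[Molecule, int, random.Random], List[Molecule]],
    score: Callable[[Molecule], float],
    is_valid: Callable[[Molecule], bool],
    rounds: int = 5,
    batch: int = 16,
    keep: int = 2,
    rng_seed: int = 0,
) -> Tuple[float, Molecule]:
    """Generate candidates, drop constraint violations, score, reseed from the top few."""
    rng = random.Random(rng_seed)
    seeds = [seed]
    best = (score(seed), seed)  # assumes the starting seed satisfies the constraints
    for _ in range(rounds):
        candidates = [c for s in seeds for c in generate(s, batch, rng)]
        valid = [c for c in candidates if is_valid(c)]  # hard constraints are a filter
        if not valid:
            break
        ranked = sorted(((score(c), c) for c in valid), reverse=True)
        seeds = [c for _, c in ranked[:keep]]  # reseed the next round
        best = max(best, ranked[0])
    return best


best_score, best_mol = guided_design("CCCCCC", toy_generate, toy_score, toy_is_valid)
print(best_score, best_mol)
```

Treating hard constraints as a filter before scoring, rather than as a penalty inside the reward, means a violating candidate can never be selected as a seed, however well it scores.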
Looking ahead, we are focused on making the workbench even more accessible and powerful for scientific discovery. Our roadmap includes:
Genesis Workbench empowers scientists to securely drive the entire drug discovery process - from hypothesis to ranked therapeutics - without their data ever leaving the environment. It unifies GPU-accelerated tools like Parabricks, CUDA-X Data Science, Proteina-Complexa, GenMol, and BioNeMo Agent Toolkit under Unity Catalog governance and puts them behind an intuitive UI built specifically for bench scientists. This in-silico pipeline helps ensure that only the highest-probability targets advance to the wet lab, reducing wasted time and resources. This is the promise of industry AI made concrete: bringing specialized, secure AI directly to your data.
Deploy Genesis Workbench today from our GitHub repository. We also provide Claude Code skills to assist you with deployments and modifications. We welcome contributions back to the project. If you are already a Databricks customer and interested in a live demo, please talk to your Databricks Account team.
Genesis Workbench is an open Databricks Industry Solutions blueprint.