Healthcare and life science organizations generate petabytes of structured and unstructured data, including genomics, clinical trial data, EHR data, medical images, streaming IoT device data and more. By aggregating this data into a health lakehouse, organizations can build a single longitudinal view of patient health that can better inform the development and delivery of new treatments. This Solution Accelerator builds on Smolder, an open-source library for ingesting EHR data in real time, and provides a template for building a health lakehouse on Databricks.
Data analytics and machine learning in life sciences
Solution Accelerators for life sciences
Based on best practices from our work with leading pharmaceutical and biotech organizations, we’ve developed Solution Accelerators for common analytics and machine learning use cases that can save your data scientists, engineers and analysts weeks or months of development time.
Precision prevention uses data to identify patients at risk of developing a disease, then provides treatments and interventions to reduce that risk. It’s often far easier to prevent a disease than to reverse it. This Solution Accelerator uses real-world data to help identify patients at risk.
Automating digital pathology
Modern imaging technologies enable healthcare and life science organizations to rapidly digitize high-resolution pathology slides. These large data sets can be used to build automated diagnostics with machine learning that, in turn, help organizations improve the efficiency and effectiveness of diagnosing and researching disease. This Solution Accelerator provides an automated methodology for rapidly identifying regions of metastases in whole slide images with deep learning.
Genetic association studies
Genome-wide association studies help identify genetic variations that are associated with a particular disease. This information can be used to better detect, treat and prevent chronic conditions such as asthma, cancer, diabetes and heart disease. This Solution Accelerator and open-source project provides a new, scalable method for whole-genome regression.