SESSION

Distributed Deep Learning for Cancer Cell Typing and Tumor Purity

OVERVIEW

EXPERIENCE: In Person
TYPE: Lightning Talk
TRACK: Data Science and Machine Learning
INDUSTRY: Health and Life Sciences
TECHNOLOGIES: AI/Machine Learning, Apache Spark, Delta Lake
SKILL LEVEL: Advanced
DURATION: 20 min

At Providence St. Joseph Health, we are building digital pathology workflows that use an AI/ML vision model to analyze tumors accurately from H&E-stained slides. Using Azure Databricks, we distribute complex image processing tasks across a Spark cluster, achieving a tenfold speedup per whole slide image (WSI). The talk covers three areas: overcoming OpenSlide file management challenges by caching across executors, running a pre-trained StarDist model in parallel over thousands of WSI tiles, and applying GIS-style spatial joins for precise cell labeling. This work significantly enhances our large-scale genomics research and advances digital pathology.
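To make the GIS-style spatial join concrete, here is a minimal pure-Python sketch that labels each detected cell by the annotated region containing its centroid. It is an illustration only, not the speakers' implementation: the function names (`point_in_polygon`, `label_cells`), the region names, and all coordinates are hypothetical, and a production workflow on Spark would more likely rely on a geospatial library than on a hand-written point-in-polygon test.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # vertices in order; the last vertex connects back to the first


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray-casting test: cast a ray in +x from the point and count edge crossings."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_cells(
    centroids: Dict[str, Point], regions: Dict[str, Polygon]
) -> Dict[str, Optional[str]]:
    """Spatial join: map each cell id to the first region containing its centroid."""
    labels: Dict[str, Optional[str]] = {}
    for cell_id, centroid in centroids.items():
        labels[cell_id] = next(
            (name for name, poly in regions.items() if point_in_polygon(centroid, poly)),
            None,  # centroid falls outside every annotated region
        )
    return labels


# Hypothetical annotation polygons and cell centroids in slide pixel coordinates.
regions = {
    "tumor": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "stroma": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
centroids = {
    "cell_1": (2.5, 3.0),
    "cell_2": (15.0, 5.0),
    "cell_3": (30.0, 30.0),
}

labels = label_cells(centroids, regions)
print(labels)
```

In this sketch the join is a nested loop, which is quadratic in the worst case. At the scale of thousands of tiles per slide, a spatial index or a partitioned join would be the natural next step.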

SESSION SPEAKERS

Robert Kramer

Principal Data Scientist
Providence Health & Services