Amir Kermany

Health and Life Sciences Solutions Architect, Databricks

Amir Kermany is a Health and Life Sciences Solutions Architect at Databricks, where he draws on his expertise in genomics and machine learning to help companies in the field generate actionable insights from vast amounts of health-related data.

Amir’s past positions include Sr. Staff Scientist at AncestryDNA, Sr. Data Scientist at Shopify, and Postdoctoral Scholar at the Howard Hughes Medical Institute and the University of Montreal. He holds a PhD in Mathematical Biology, an MSc in Electrical Engineering, and a BSc in Physics.

Past sessions

Healthcare, life sciences, and agricultural companies are generating petabytes of data, whether through genome sequencing, electronic health records, imaging systems, or the Internet of Medical Things. The value of these datasets grows when we are able to blend them together, such as integrating genomics and EHR-derived phenotypes for target discovery, or blending IoMT data with medical images to predict patient disease severity. In this session, we will look at the challenges customers face when blending these data types together. We will then present an architecture that uses the Databricks Unified Data Analytics Platform to unify these data types into a single data lake, and discuss the use cases this architecture can empower. We will then dive into a workload that uses the whole genome regression method from Project Glow to accelerate the joint analysis of genotype and phenotype data. Afterwards, Frank Austin Nothaft, Technical Director for Healthcare and Life Sciences, will be available to answer questions about this solution or any other use case questions you may have across healthcare, the life sciences, or agriculture.

Speaker: Amir Kermany

Summit 2020 Generalized SEIR Model on Large Networks

June 25, 2020 05:00 PM PT

The SEIR model is widely used for simulating the spread of infectious diseases. In its simplest form, it assumes that each individual in the population is in one of four states: Susceptible, Exposed, Infected, or Recovered (or Removed), and the evolution of the system is described by a system of ordinary differential equations. Although this simple model performs well for large, dense populations, it does not capture population substructure or the effect of variation in interactions.
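
As an illustration of the simple ODE form described above, here is a minimal Python sketch, not taken from the talk. It assumes the textbook rate parameters `beta` (transmission), `sigma` (progression from exposed to infected), and `gamma` (recovery), and integrates the equations with plain forward Euler steps. The function name `simulate_seir` and all numeric values are hypothetical.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Integrate the classic SEIR equations with forward Euler steps.

    dS/dt = -beta * S * I / N
    dE/dt =  beta * S * I / N - sigma * E
    dI/dt =  sigma * E - gamma * I
    dR/dt =  gamma * I
    """
    n = s0 + e0 + i0 + r0  # total population, constant in this model
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    history = [(0.0, s, e, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        # Flows between compartments during one time step.
        new_exposed = beta * s * i / n * dt
        new_infected = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infected
        i += new_infected - new_recovered
        r += new_recovered
        history.append((k * dt, s, e, i, r))
    return history


# Illustrative (assumed) parameters: 10 initial infections in 10,000 people.
trajectory = simulate_seir(beta=0.5, sigma=0.2, gamma=0.1,
                           s0=9990, e0=0, i0=10, r0=0, days=160)
```

Because every individual is assumed to mix uniformly with everyone else, the only inputs are compartment totals, which is exactly the limitation the abstract points out.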

To address these issues, the generalized SEIR model represents the population as a network in which nodes are individuals and edges represent interactions between them. This model has attracted more attention during the COVID-19 pandemic, and there are Python implementations that run the simulation on a single machine.
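
The network formulation can be sketched on a single machine in a few lines of Python. This is a simplified discrete-time stochastic variant written for illustration, not the implementation discussed in the talk: the function `simulate_network_seir`, the per-step probabilities `p_transmit`, `p_progress`, and `p_recover`, and the ring-shaped contact network are all assumptions.

```python
import random
from collections import Counter

STATES = ("S", "E", "I", "R")


def count_states(state):
    """Return the number of individuals in each compartment."""
    tally = Counter(state.values())
    return {name: tally.get(name, 0) for name in STATES}


def simulate_network_seir(adjacency, initially_infected, p_transmit,
                          p_progress, p_recover, steps, seed=0):
    """Discrete-time stochastic SEIR on a contact network.

    adjacency maps each individual to the list of its neighbors.
    """
    rng = random.Random(seed)
    state = {node: "S" for node in adjacency}
    for node in initially_infected:
        state[node] = "I"
    history = [count_states(state)]
    for _ in range(steps):
        next_state = dict(state)  # update all nodes synchronously
        for node, status in state.items():
            if status == "S":
                infected_neighbors = sum(
                    1 for nb in adjacency[node] if state[nb] == "I")
                # Probability of escaping infection from every infected neighbor.
                p_escape = (1 - p_transmit) ** infected_neighbors
                if rng.random() < 1 - p_escape:
                    next_state[node] = "E"
            elif status == "E" and rng.random() < p_progress:
                next_state[node] = "I"
            elif status == "I" and rng.random() < p_recover:
                next_state[node] = "R"
        state = next_state
        history.append(count_states(state))
    return history


# Hypothetical contact network: 20 individuals arranged in a ring.
ring = {n: [(n - 1) % 20, (n + 1) % 20] for n in range(20)}
history = simulate_network_seir(ring, [0], p_transmit=0.3, p_progress=0.5,
                                p_recover=0.2, steps=50, seed=42)
```

Here infection can only travel along edges, so the shape of the network, not just the compartment totals, determines how the outbreak unfolds. A loop like this visits every node on every step, which is what becomes a bottleneck on large networks.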

In this talk, we discuss implementing the generalized SEIR model using Spark and graph analysis libraries such as GraphFrames, and using stochastic simulation methods to predict the spread of COVID-19 on Databricks.