Alex is a senior data engineer at Eventbrite. He began using Spark in 2014 to build event-based recommendations. Since then he has been building, extending, and optimizing Eventbrite’s data infrastructure on behalf of its analysts, data scientists, and other engineers. From ingesting new data to creating downstream processes, he has been a primary driver of Eventbrite’s growth in the world of big data.
October 22, 2021 03:04 PM PT
Deploying machine learning models seems like it should be a relatively easy task: take your model and pass it some features in production. The reality is that the code written during the prototyping phase of model development doesn't always work when applied at scale or on "real" data. This talk will explore 1) common problems at the intersection of data science and data engineering, 2) how you can structure your code so there is minimal friction between prototyping and production, and 3) how you can use Apache Spark to run predictions with your models in batch or streaming contexts.
You will learn how to address some of the productionizing issues that data scientists and data engineers face while deploying machine learning models at scale, and gain a better understanding of how to work collaboratively to minimize the disparity between prototyping and production.
Session hashtag: #SAISDS2