Max Cantor

Software Engineer, Condé Nast

Max Cantor is a Machine Learning Software Engineer at Condé Nast. He designs and maintains machine learning platforms that scale to thousands of models and terabytes of data in production. He is an Insight Data Engineering Fellow, holds a Master's degree in Cognitive Psychology and Cognitive Neuroscience from the University of Colorado Boulder, and graduated with Honors from the University of Michigan with a Bachelor's degree in Psychology.

Past sessions

Condé Nast is a global leader in the media production space, housing iconic brands such as The New Yorker, Wired, Vanity Fair, and Epicurious, among many others. Along with our content production, Condé Nast invests heavily in companion products to improve and enhance our audience's experience. One such product is Spire, Condé Nast's service for user segmentation and targeted advertising for over a hundred million users.

While Spire started as a set of Databricks notebooks, we later used DBFS to deploy Spire distributions as Python wheels, and more recently we have packaged the entire production environment into Docker images deployed onto our Databricks clusters. In this talk, we will walk through the process of evolving our Python distributions and production environment into Docker images, and discuss where this has streamlined our deployment workflow, where there were growing pains, and how we dealt with them.
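As a rough illustration of the Docker-based approach described above, a Databricks cluster spec can reference a custom image via Databricks Container Services. The registry URL, image name, and secret paths below are placeholders for illustration, not Condé Nast's actual configuration:

```json
{
  "cluster_name": "spire-production",
  "spark_version": "9.1.x-scala2.12",
  "node_type_id": "i3.xlarge",
  "num_workers": 8,
  "docker_image": {
    "url": "registry.example.com/spire:latest",
    "basic_auth": {
      "username": "{{secrets/registry/username}}",
      "password": "{{secrets/registry/password}}"
    }
  }
}
```

With the production environment baked into the image, the cluster no longer depends on init scripts or wheel uploads to DBFS to assemble its dependencies at startup.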

In this session watch:
Harin Sanghirun, Machine Learning Engineer, Condé Nast
Max Cantor, Software Engineer, Condé Nast


Condé Nast is a global leader in the media production space, housing iconic brands such as The New Yorker, Wired, Vanity Fair, and Epicurious, among many others. Along with our content production, Condé Nast invests heavily in companion products to improve and enhance our audience's experience. One such product is Spire, Condé Nast's service for user segmentation and targeted advertising for over a hundred million users. Spire consists of thousands of models, many of which require individual scheduling and optimization. From data preparation to model training to inference, we've built abstractions around the data flow, monitoring, orchestration, and other internal operations. In this talk, we explore the complexities of building large-scale machine learning pipelines within Spire and discuss some of the solutions we've discovered using Databricks, MLflow, and Apache Spark. The key focus is on production-grade engineering patterns, the inner workings of the required components, and the lessons learned throughout their development.
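To give a flavor of the kind of abstraction the abstract describes, the sketch below wraps each model's data preparation, training, and inference stages behind one pipeline interface that a small orchestrator can register and run. All names here are hypothetical illustrations of the pattern, not Spire's actual API:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class ModelPipeline:
    """One model's end-to-end stages, bundled so an orchestrator
    can schedule thousands of such pipelines uniformly."""
    name: str
    prepare: Callable[[list], list]        # raw records -> clean records
    train: Callable[[list], float]         # clean records -> fitted "model" (a toy threshold here)
    infer: Callable[[float, list], list]   # model + records -> predictions

class Orchestrator:
    """Minimal registry that runs a named pipeline's stages in order."""
    def __init__(self) -> None:
        self.pipelines: Dict[str, ModelPipeline] = {}

    def register(self, pipeline: ModelPipeline) -> None:
        self.pipelines[pipeline.name] = pipeline

    def run(self, name: str, raw: list) -> list:
        p = self.pipelines[name]
        data = p.prepare(raw)     # data preparation stage
        model = p.train(data)     # model training stage
        return p.infer(model, data)  # inference stage

# Example: a toy "segment" model that flags values above the mean.
orch = Orchestrator()
orch.register(ModelPipeline(
    name="toy-segment",
    prepare=lambda raw: [x for x in raw if x is not None],
    train=lambda data: sum(data) / len(data),
    infer=lambda mean, data: [x > mean for x in data],
))
print(orch.run("toy-segment", [1, None, 2, 3]))  # → [False, False, True]
```

In a real deployment the stages would be Spark jobs with runs and models tracked in MLflow, but the shape is the same: a common interface per model so scheduling and monitoring can be handled generically rather than per notebook.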