I’m a Software Engineer based in NYC.
Conde Nast is a global leader in media production, home to iconic brands such as The New Yorker, Wired, Vanity Fair, and Epicurious, among many others. Alongside our content production, Conde Nast invests heavily in companion products that improve our audience's experience. One such product is Spire, Conde Nast's service for user segmentation and targeted advertising for over a hundred million users. Spire consists of thousands of models, many of which require individual scheduling and optimization. From data preparation to model training to inference, we've built abstractions around data flow, monitoring, orchestration, and other internal operations. In this talk, we explore the complexities of building large-scale machine learning pipelines within Spire and discuss some of the solutions we've discovered using Databricks, MLflow, and Apache Spark. The key focus is on production-grade engineering patterns, the inner workings of the required components, and the lessons learned throughout their development.