SESSION

Migrating and Optimizing Large-Scale Streaming Applications with Databricks (repeated)


OVERVIEW

EXPERIENCE: In Person
TYPE: Breakout
TRACK: Data Engineering and Streaming
INDUSTRY: Media and Entertainment
TECHNOLOGIES: Apache Spark, Developer Experience, Orchestration
SKILL LEVEL: Intermediate
DURATION: 40 min

This session is repeated.

Our large-scale streaming application processes hundreds of billions of ad events daily at over 5 GB/s. It transforms, joins, and routes these events to hundreds of heterogeneous destinations, enabling real-time analytics, batch reporting, ML-based forecasting, and streaming ad log delivery for programmatic ad campaigns. In this session, we will discuss how we rearchitected, redeveloped, and migrated this application, which has over 30K lines of code, to a Databricks Spark Structured Streaming architecture. We'll share lessons learned, cover the substantial benefits gained, and detail how we improved performance through memory-related optimizations, Kinesis parameter tuning, parallelizing the output stage within each micro-batch, and other tweaks. We'll also introduce FreeWheel, programmatic advertising, the architecture of the larger data platform that incorporates this streaming application, and our robust monitoring and observability solution. Finally, we'll highlight several Databricks features that improved our development experience, such as the Databricks AI assistant.

SESSION SPEAKERS

Sharif Doghmi

Lead Software Engineer
FreeWheel, A Comcast Company

Donghui Li

Lead Software Engineer
FreeWheel, A Comcast Company