Spark at Euclid - Databricks

At Euclid, we are making the physical world just as machine-readable, trackable, and actionable as cookies and click-throughs have made the online retail world. To do this, we process logs from sensors around the globe to understand how people behave and how they interact with physical retail locations. This challenging task requires us to model each user’s behavior at the device level, which means that we design, train, and deploy thousands of machine learning models daily. We recently introduced Spark into the core of our analytics stack. Doing so has given us greater flexibility in our analysis and has improved our accuracy, reporting, and testing. We intend to discuss two topics: the technological and programmatic aspects of switching to a strongly typed system for our small engineering team, and the continuing challenges we face in deploying daily-tuned random forest and naive Bayes models at scale.