We all know how to create ML models, but the path to turning them into a highly scalable, easy-to-use system is not always clear. What happens when you need to run thousands of them, on many different datasets, simultaneously and at huge scale? And do it reliably, so you can sleep well at night?
To achieve exactly that, we’ve decided to go down the serverless route and build an anomaly detection system on top of it. We’ll go over the pros and cons of building such a system using serverless and when such an approach could work for you.
Our SpotLight anomaly detection system can easily reuse ML models and scales to run millions of time series simultaneously. It eliminates manual work and lets end users with no scientific background configure the anomalies they want to detect in a plug-and-play way and get alerts in no time.
In this talk, we’ll walk you through the architecture and share useful ideas you can adopt and implement in your own projects.
Opher is a big data team lead at Nielsen. His team builds massive data pipelines that are cost-effective and scalable (~250 billion events/day). Their projects run on AWS, using Spark, serverless L...
I am an experienced Data Engineer with a history of big data and machine learning projects. I love to tackle big data and scale challenges and find ways to run the same workloads at 1/10 the processin...