Data engineer at Avast with a focus on machine learning pipelines and experience in network security and ML research
November 18, 2020 04:00 PM PT
At Avast we complete over 17 million phishing detections a day, providing crucial online protection against this type of attack.
In this talk, Joao Da Silva and Yury Kasimov will present the MATS stack for productionising machine learning and describe their journey integrating model tracking, storage, cross-system orchestration and model deployment into a complete, modern machine learning pipeline.
The MATS stack can be integrated into an existing ecosystem without disruption; there is no need to suddenly migrate to a clean AWS setup.
The MATS stack combines MLflow, Airflow, TensorFlow and Spark into a cross-system orchestrated ML pipeline, giving data scientists at Avast a standard set of well-integrated tools to adopt.
They will use Angler, an internal machine learning project for detecting phishing URLs, to demonstrate how the MATS stack was applied to this ML pipeline, walking the audience through all stages of the Angler pipeline: data transformations and enrichments in Spark, model training, experiment tracking and model serving. The pipeline supports fast, reproducible experiments and allows quick progression from research to production.
Speakers: Joao Da Silva and Yury Kasimov