Streaming Random Forest Learning in Spark and StreamDM - Databricks

We present how to build random forest models from streaming data by training, predicting, and adapting the model in real time as the data stream evolves. The implementation is available in the open source library StreamDM, which is built on top of Apache Spark.
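The train, predict, and adapt cycle described above can be illustrated with a toy sketch in plain Python. It does not use StreamDM or Spark, and it is not the algorithm presented in the talk. All names (`CountingLearner`, `StreamingEnsemble`, `prequential_accuracy`, `make_stream`) are hypothetical. The base learners are one-feature count tables rather than incremental decision trees, Poisson-weighted updates approximate bagging on a stream, and a fixed error-rate window is an assumed stand-in for a proper drift detector.

```python
import math
import random
from collections import Counter, defaultdict, deque


class CountingLearner:
    """Toy incremental learner: class counts keyed by one feature's value."""

    def __init__(self, feature_index):
        self.feature_index = feature_index
        self.counts = defaultdict(Counter)  # feature value -> class counts

    def predict(self, x):
        seen = self.counts.get(x[self.feature_index])
        return seen.most_common(1)[0][0] if seen else None

    def learn(self, x, y, weight=1):
        self.counts[x[self.feature_index]][y] += weight


class StreamingEnsemble:
    """Ensemble with Poisson-weighted updates and a crude reset-on-error rule."""

    def __init__(self, n_features, n_members=10, lam=1.0,
                 window=50, max_error=0.5, seed=0):
        self.rng = random.Random(seed)
        self.n_features, self.lam = n_features, lam
        self.window, self.max_error = window, max_error
        self.members = [self._new_member() for _ in range(n_members)]
        self.errors = [deque(maxlen=window) for _ in range(n_members)]

    def _new_member(self):
        # Each member looks at one randomly chosen feature.
        return CountingLearner(self.rng.randrange(self.n_features))

    def _poisson(self):
        # Knuth's method for sampling Poisson(lam).
        threshold, k, p = math.exp(-self.lam), 0, 1.0
        while True:
            p *= self.rng.random()
            if p <= threshold:
                return k
            k += 1

    def predict(self, x):
        votes = Counter(m.predict(x) for m in self.members)
        votes.pop(None, None)  # ignore members with no opinion yet
        return votes.most_common(1)[0][0] if votes else None

    def learn(self, x, y):
        for i, member in enumerate(self.members):
            self.errors[i].append(member.predict(x) != y)
            full = len(self.errors[i]) == self.window
            if full and sum(self.errors[i]) / self.window > self.max_error:
                # Adapt: replace a member whose recent error rate is too high.
                self.members[i] = member = self._new_member()
                self.errors[i].clear()
            k = self._poisson()
            if k > 0:
                member.learn(x, y, k)


def prequential_accuracy(model, stream):
    """Test-then-train: predict each instance before learning from it."""
    correct = total = 0
    for x, y in stream:
        correct += model.predict(x) == y
        model.learn(x, y)
        total += 1
    return correct / total


def make_stream(n=4000, drift_at=2000, seed=1):
    """Synthetic stream: label follows feature 0, then flips (concept drift)."""
    rng = random.Random(seed)
    for t in range(n):
        x = (rng.randint(0, 1), rng.randint(0, 1))
        y = x[0] if t < drift_at else 1 - x[0]
        yield x, y


accuracy = prequential_accuracy(StreamingEnsemble(n_features=2), make_stream())
print(f"prequential accuracy: {accuracy:.3f}")
```

Because each instance is scored before the model learns from it, every prediction is made on data the model has not yet seen, so this sketch needs no separate hold-out set.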

Session hashtag: #SAISEco7

About Heitor Murilo Gomes

Heitor is currently a machine learning researcher at Télécom ParisTech and a visiting researcher at INESC TEC (LIAAD). His main research area is machine learning, specifically evolving data streams, concept drift, ensemble methods, and big data streams. Heitor contributes to both the MOA and StreamDM open source data stream mining projects.

About Albert Bifet

Albert Bifet is a Professor at LTCI, Télécom ParisTech, Head of the Data, Intelligence and Graphs (DIG) Group at Télécom ParisTech, and Scientific Collaborator at École Polytechnique. His research focuses on machine learning for data streams, big data machine learning, and artificial intelligence. The problems he investigates are motivated by large-scale data, the Internet of Things (IoT), and big data science. Albert also co-leads the open source projects MOA (Massive Online Analysis) and Apache SAMOA (Scalable Advanced Massive Online Analysis).