Ramya Raghavendra is a research scientist at IBM Research working on Distributed Cognitive Systems research. She received her MS (2009) and PhD (2010) in Computer Science from the University of California, Santa Barbara, where she was awarded the Outstanding Graduate Student award, the Grace Hopper fellowship, and the ACM Student Research award. Her current work involves building systems, runtimes, and algorithmic support for Big Data and Analytics, with a focus on applying machine-learning techniques to solve enterprise problems. Ramya is the recipient of the IBM Outstanding Technical Achievement Award (2017) and the Master Inventor award (2015), and has over 20 peer-reviewed publications and over 24 patents.
As common sense would suggest, weather has a definite impact on traffic. But how much? And under what circumstances? Can we improve traffic (congestion) prediction given weather data? Predictive traffic is envisioned to significantly change how drivers plan their day by alerting them before they travel, helping them find the best times to travel, and, over time, learning from new IoT data such as road conditions and incidents. This talk will cover the traffic prediction work conducted jointly by IBM and the traffic data provider. As part of this work, we conducted a case study covering five large US metropolitan areas, 2.58 billion traffic records, and 262 million weather records to quantify the boost in traffic prediction accuracy from using weather data. We will provide an overview of our lambda architecture, in which Apache Spark is used to build prediction models from weather and traffic data and Spark Streaming is used to score the models and provide real-time traffic predictions. This talk will also cover a suite of extensions to Spark for analyzing geospatial and temporal patterns in traffic and weather data, as well as the machine learning algorithms that were used with the Spark framework. Initial results of this work were presented at the National Association of Broadcasters meeting in Las Vegas in April 2017, and work is under way to scale the system to provide predictions in over 100 cities. Attendees will learn about our experience scaling with Spark in offline and streaming modes, building statistical and deep-learning pipelines with Spark, and techniques for working with geospatial and time-series data. Session hashtag: #EUent7