Rob Ferguson is Director of Engineering at Automatic Labs, which makes a connected-car device and experiences designed to make driving smarter and safer. Previously, he founded the Data Science team at Rdio, a streaming music service, and delivered several data products, including automated stations. He is passionate about data products and insights, with over 10 years of professional development experience as well as research experience at Johns Hopkins and McGill Universities.
Automatic is the most widely used connected-car device in the emerging Internet of Things. Our connected-car adapter exposes a huge amount of data previously hidden within the car’s computer. This includes hundreds of measurements per minute of driving, ranging from velocity and location to mass air flow and intake air temperature. We use the Spark ecosystem to sanitize and process this data, generate insights for individual drivers, and answer broader questions about transportation planning. Drivers’ understanding of a car’s efficiency typically ends with the EPA rating on the car’s window sticker at purchase. Using Spark, we compare each car’s data with that of other cars in the same class, make, model, etc. We run batch jobs in Spark to train a unique physical model of each Automatic-connected vehicle, based on its real-world mechanical data. We also generate an ‘expected’ model, based on EPA drive cycle data, for each supported make, model and year, and compare the two to detect inefficient vehicle operation or driving behavior among our users, e.g. an attached ski rack, under-inflated tires or aggressive acceleration. Automatic also detects events such as hard braking, hard acceleration and speeding, along with the location where they occur, while a trip is in progress. We upload these events in real time to a Spark Streaming pipeline that clusters them geographically and then applies logic to detect trends that can indicate road hazards and opportunities for traffic planning improvements, e.g. blind intersections, inefficient signal placement or timing, or poor road conditions. This talk will highlight how we use Spark and Spark Streaming for these applications, along with some novel techniques for analyzing and visualizing real-time automotive data.
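The abstract does not say how the geographical clustering is done. As a rough illustration only, the sketch below uses plain Python and a simple fixed-size grid as a stand-in for whatever clustering the pipeline actually uses: each event is assigned to a latitude/longitude cell, and cells that collect enough events of the same kind are flagged as candidate hotspots. The names `Event`, `grid_cell` and `hotspots`, the cell size and the threshold are all hypothetical choices made for this example.

```python
import math
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class Event(NamedTuple):
    """A driving event with its location (hypothetical record shape)."""
    kind: str   # e.g. "hard_brake", "hard_accel", "speeding"
    lat: float  # degrees
    lon: float  # degrees


Cell = Tuple[int, int]


def grid_cell(lat: float, lon: float, cells_per_degree: int = 1000) -> Cell:
    """Map a coordinate to a grid cell.

    With 1000 cells per degree, a cell spans 0.001 degrees of latitude,
    roughly 111 m. This is an assumed resolution, not one from the talk.
    """
    return (math.floor(lat * cells_per_degree),
            math.floor(lon * cells_per_degree))


def hotspots(events: Iterable[Event],
             cells_per_degree: int = 1000,
             min_count: int = 3) -> Dict[Tuple[str, Cell], int]:
    """Count events per (kind, cell) and keep cells at or above min_count."""
    counts = Counter(
        (e.kind, grid_cell(e.lat, e.lon, cells_per_degree)) for e in events
    )
    return {key: n for key, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    sample = [
        Event("hard_brake", 37.77491, -122.41941),
        Event("hard_brake", 37.77493, -122.41945),
        Event("hard_brake", 37.77496, -122.41948),
        Event("speeding", 37.80010, -122.40010),
    ]
    for (kind, cell), n in hotspots(sample).items():
        print(kind, cell, n)
```

In a streaming setting, the same (kind, cell) key could be counted per time window rather than over a fixed list, so that a cell which repeatedly crosses the threshold stands out as a trend rather than a one-off. A density-based method would avoid splitting a cluster across a cell boundary, at the cost of more computation.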