At Shopify, we underwrite credit card transactions, exposing us to the risk of losing money. We need to respond to risky events as they happen, and a traditional ETL pipeline just isn’t fast enough. Spark Streaming is an incredibly powerful realtime data processing framework based on Apache Spark. It allows you to process realtime streams from sources like Apache Kafka using Python with incredible simplicity.
Nick has applied his statistics education to epidemiology, survey collection, and more recently, data science. He works for Shopify and spends his days writing PySpark jobs. He also leads the development of the company's realtime risk management software, which recently switched to using Spark Streaming. Nick hails from Northern Ontario, Canada, and is one of those crazy few who love cold weather.