In Spark 2.0, we introduced Structured Streaming, which allows users to continually and incrementally update their view of the world as new data arrives, while still using the same familiar Spark SQL abstractions. I talk about the progress we’ve made since then on robustness, latency, expressiveness, and observability, using examples of production end-to-end continuous applications.
Michael Armbrust is a committer and PMC member of Apache Spark and the original creator of Spark SQL. He currently leads the team at Databricks that designed and built Structured Streaming and Databricks Delta. He received his PhD from UC Berkeley in 2013, where he was advised by Michael Franklin, David Patterson, and Armando Fox. His thesis focused on building systems that allow developers to rapidly build scalable interactive applications, and specifically defined the notion of scale independence. His interests broadly include distributed systems, large-scale structured storage, and query optimization.