Structured Streaming plays an important role in current data infrastructure. In response to growing streaming requirements, we have been actively developing Structured Streaming in Spark over the past few months. In this talk, Kristine Guo and Liang-Chi Hsieh will detail some of the issues that arose when applying Structured Streaming and what was done to address them. Specifically, they will cover:
Finally, they will detail how these features can help to compute aggregates over dynamic batches with minimum size requirements and perform stream-stream joins, while supporting high RPS and throughput.
Kristine is a software engineer at Apple focused on cloud platform technologies. She currently works on developing high-scale backend systems. Prior to joining Apple, Kristine obtained her Bachelor's and Master's degrees in Computer Science from Stanford University.
Liang-Chi Hsieh is an Apache Spark Committer and an open source and big data engineer at Apple. Most of his contributions to Apache Spark are in the SQL and MLlib modules. He has recently been working on Structured Streaming. Prior to joining Apple, Liang-Chi worked on the internal Spark platform at Uber. He holds a Ph.D. in Computer Science from National Taiwan University.