Life happens in real time, so it is no surprise that more solutions are being built on streaming technologies. Event-based architectures are becoming the norm, and customers expect immediate access to their data. This new world offers many exciting opportunities, but also some new challenges. What do you do when your streaming data is incomplete? What if it depends on another data source? Does that dependent data exist yet, and does it come from a third party? How do you assemble a complete picture when data arrives from multiple places at the same time? These questions are the new norm in the world of distributed services.
Join us as we dive deep into the technical details behind these scenarios and more. Expect to learn about stream-stream joins, enriching streams with local or remote data, and ways to anticipate and correct errors within the stream. Leave with a better understanding of how to manage data dependencies within a Spark Structured Streaming application.
Aaron is a Sr. Director of Data Engineering who has built multiple data lakes and data warehouses for major companies across the financial space. He has spent the last couple of years working passionately with evolving technologies such as the Lakehouse, data mesh, and scalable data systems. Aaron holds a master's degree in information technology from the University of Wisconsin.
Kevin Mellott is a Team Lead and Spark Data Developer at FIS, working to improve the data processing pipeline of one of the world's largest FinTech companies. Although he began as a traditional Software Engineer, his career ventured into the world of data science several years ago. Recent projects have included the use of Spark's machine learning pipelines as well as the Python Natural Language Toolkit. When he isn't wrestling with Big Data, Kevin can be found at the local movie theater or ice hockey rink.