Extending Apache Spark APIs Without Going Near Spark Source or a Compiler

The more time you spend developing within a framework such as Apache Spark, the more you notice additional features that would be helpful for your specific use case. Spark supports a very concise and readable coding style using functional programming paradigms. Wouldn’t it be awesome to add your own functions into the mix using the same style? Well, you can!

In this session, you will learn about using Scala’s “Enrich my library” programming pattern to add new functionality to Spark’s APIs. We will dive into a how-to guide with code snippets and present an example where this strategy was used to develop a validation framework for Spark Datasets in a production pipeline. Come learn how to enrich your Spark!
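As a rough illustration of the pattern described above, here is a minimal sketch of how an implicit class can attach a new method to Spark's `Dataset`. This is not the validation framework presented in the session; the names `DatasetEnrichments`, `ValidationOps`, and `validate` are hypothetical, and the sketch assumes Spark SQL is on the classpath and that a local `SparkSession` is acceptable for trying it out.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical names throughout: nothing here is part of Spark's own API.
object DatasetEnrichments {

  // The implicit class wraps any Dataset[T]. Once it is in scope, its methods
  // can be called as if they were defined on Dataset itself.
  implicit class ValidationOps[T](val ds: Dataset[T]) {

    // Counts the rows that violate the predicate and fails if there are any;
    // otherwise returns the original Dataset so calls can be chained.
    def validate(description: String)(isValid: T => Boolean): Dataset[T] = {
      val isInvalid: T => Boolean = row => !isValid(row)
      val invalidCount = ds.filter(isInvalid).count()
      require(invalidCount == 0, s"$invalidCount row(s) failed check: $description")
      ds
    }
  }
}

object EnrichSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("enrich-sketch")
      .getOrCreate()
    import spark.implicits._
    import DatasetEnrichments._ // brings the implicit class into scope

    val ages = Seq(21, 34, 58).toDS()

    // validate reads like a built-in Dataset method and chains with real ones.
    ages
      .validate("age must be non-negative")(_ >= 0)
      .filter(_ > 30)
      .show()

    spark.stop()
  }
}
```

The design choice worth noting is that the enrichment lives entirely in user code: importing `DatasetEnrichments._` is enough to make `validate` available, so neither Spark's source nor a custom Spark build is involved.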

Session hashtag: #DevSAIS19

About Anna Holschuh

Anna Holschuh is a Lead Data Engineer at Target HQ on the Enterprise Data Analytics and Business Intelligence team. She has combined her love of all things Target with building scalable, high-throughput systems with an emphasis on machine learning. At Target, Anna is currently building Spark production pipelines that help bring the best mix of products to Target guests all over the country. She completed her S.B. and M.Eng. in EECS at MIT, with a focus on machine learning for her graduate work. Anna hails from the Twin Cities in Minnesota.