In this talk you will learn how to easily configure Apache Arrow with R on Apache Spark, which will allow you to gain speed improvements and expand the scope of your data science workflows; for instance, by enabling data to be efficiently transferred between your local environment and Apache Spark. This talk will present use cases for running R at scale on Apache Spark. It will also introduce the Apache Arrow project and recent developments that enable running R with Apache Arrow on Apache Spark to significantly improve performance and efficiency. We will end this talk by discussing performance and recent development in this space.
Javier holds a double degree in Math and Software Engineer and decades of industry experience with a focus on data analysis. He currently works in RStudio and previously in Microsoft Research and SAP.