Running R at Scale with Apache Arrow on Spark - Databricks

In this talk you will learn how to configure Apache Arrow with R on Apache Spark to gain speed improvements and expand the scope of your data science workflows, for instance by transferring data efficiently between your local environment and Apache Spark. The talk presents use cases for running R at scale on Apache Spark, introduces the Apache Arrow project, and covers recent developments that enable running R with Apache Arrow on Apache Spark to significantly improve performance and efficiency. It closes with a discussion of performance and recent work in this space.

About Javier Luraschi

Javier is the author of "Mastering Spark with R", sparklyr, mlflow and many other R packages for deep learning and data science. He holds a double degree in Math and Software Engineering and has decades of industry experience with a focus on data analysis. He currently works at RStudio and previously worked at Microsoft Research and SAP.