Accelerating Shuffle: A Tailor-Made RDMA Solution for Apache Spark - Databricks

The opportunity to accelerate Spark by improving its network data transfer facilities has been much debated in the last few years. RDMA (remote direct memory access) is a network acceleration technology that is very prominent in the HPC (high-performance computing) world, but it has not yet made its way into mainstream Apache Spark. A proper implementation of RDMA in network-oriented applications can improve scalability, throughput, latency, and CPU utilization. In this talk we present a new RDMA solution for Apache Spark that shows substantial improvements in multiple Spark use cases. The solution is under development in our labs and will be released to the public as an open-source plug-in.
Session hashtag: #EUres3



About Yuval Degani

Yuval recently joined LinkedIn’s Data Infrastructure team as a Staff Software Engineer, where he is focused on scaling and developing new features for Hadoop and Spark. Before that, Yuval was a Senior Manager of Engineering at Mellanox Technologies, leading a team working on introducing new network acceleration technologies to Big Data and Machine Learning frameworks. Prior to his work in the Big Data and AI fields, Yuval was a developer, an architect, and later a team leader in the areas of low-level kernel development for cutting-edge high-performance network devices. Yuval holds a BSc in Computer Science from the Technion, Israel Institute of Technology.