Thomas Graves - Databricks

Thomas Graves

Principal Systems Software Engineer, NVIDIA

Thomas Graves is a distributed systems software engineer at NVIDIA, where he concentrates on accelerating Spark. He is a committer and PMC on Apache Spark and Apache Hadoop. Previously worked for Yahoo on the Big Data Platform team working on Apache Spark, Hadoop, YARN, Storm, and Kafka.

UPCOMING SESSIONS

PAST SESSIONS

Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS Library-continuesSummit Europe 2019

GPU acceleration has been at the heart of scientific computing and artificial intelligence for many years now. GPUs provide the computational power needed for the most demanding applications such as Deep Neural Networks, nuclear or weather simulation. Since the launch of RAPIDS in mid-2018, this vast computational resource has become available for Data Science workloads too. The RAPIDS toolkit, which is now available on the Databricks Unified Analytics Platform, is a GPU-accelerated drop-in replacement for utilities such as Pandas/NumPy/ScikitLearn/XGboost.

Through its use of Dask wrappers the platform allows for true, large scale computation with minimal, if any, code changes. The goal of this talk is to discuss RAPIDS, its functionality, architecture as well as the way it integrates with Spark providing on many occasions several orders of magnitude acceleration versus its CPU-only counterparts.

Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS LibrarySummit Europe 2019

GPU acceleration has been at the heart of scientific computing and artificial intelligence for many years now. GPUs provide the computational power needed for the most demanding applications such as Deep Neural Networks, nuclear or weather simulation. Since the launch of RAPIDS in mid-2018, this vast computational resource has become available for Data Science workloads too. The RAPIDS toolkit, which is now available on the Databricks Unified Analytics Platform, is a GPU-accelerated drop-in replacement for utilities such as Pandas/NumPy/ScikitLearn/XGboost. Through its use of Dask wrappers the platform allows for true, large scale computation with minimal, if any, code changes.

The goal of this talk is to discuss RAPIDS, its functionality, architecture as well as the way it integrates with Spark providing on many occasions several orders of magnitude acceleration versus its CPU-only counterparts.