Deep Learning with Apache Spark and GPUs

Apache Spark is a powerful, scalable real-time data analytics engine that is fast becoming the de facto hub for data science and big data. In parallel, GPU clusters are fast becoming the default way to quickly develop and train deep learning models. As data science teams and data-savvy companies mature, they will need to invest in both platforms if they intend to leverage both big data and artificial intelligence for competitive advantage.
This session will cover:
– How to leverage Spark and TensorFlow for hyperparameter tuning and for deploying trained models
– DeepLearning4J, CaffeOnSpark, IBM’s SystemML and Intel’s BigDL
– Sidecar GPU cluster architecture and Spark-GPU data reading patterns
– The pros, cons and performance characteristics of various approaches

You’ll leave the session better informed about the available architectures for combining Spark and deep learning, both with and without GPUs. You’ll also learn about the pros and cons of deep learning software frameworks for various use cases, and discover a practical, applied methodology and technical examples for tackling big data deep learning.

Session hashtag: #SFds14

About Pierce Spitler

Pierce leads product data science at Bitfusion, the world's first end-to-end deep learning and AI development and infrastructure management platform. Previously, he served as the Director of Data Science and Insights for eyeQ, next-generation personalized retail displays that leverage deep learning for facial recognition. He has several years of experience interpreting sensor data, working with massive data sets, and performing deep learning on image and video data. Pierce is the co-organizer of the Austin Deep Learning Meetup, and he writes and speaks on deep learning and applied data science.