Distributed TensorFlow on Spark: Scaling Google’s Deep Learning Library

We took TensorFlow, Google’s recently open-sourced deep learning library, which runs on a single machine, and built a distributed implementation of it on Spark, using the abstraction layer called “Distributed DataFrame” (DDF). TensorFlow started as an internal research project at Google and has since been adopted across the company. Google open-sourced the project in November 2015 but kept its distributed version closed. Other deep learning frameworks have been implemented on Spark, but TensorFlow’s API is elegant and powerful, and it has proven to work at Google scale. Putting TensorFlow on Spark makes it easier for companies to apply deep learning in their own businesses without sacrificing scalability.

(1) Google’s TensorFlow release is a single-machine implementation.
(2) We have built a distributed implementation on Spark, which allows TensorFlow to scale horizontally.
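
The text above does not say how the work is split across a cluster, so the following is only a minimal sketch of one common way to scale training horizontally: synchronous data parallelism. It is an assumption for illustration, not the implementation described here. It uses plain Python with no Spark, DDF, or TensorFlow calls, and every name in it (partition, partial_gradient, train) is hypothetical. Each "worker" computes a gradient over its own slice of the data, and a "driver" sums the partial results and updates the shared model.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) training example


def partition(data: List[Point], num_partitions: int) -> List[List[Point]]:
    """Split the data round-robin, standing in for a cluster's data partitions."""
    return [data[i::num_partitions] for i in range(num_partitions)]


def partial_gradient(w: float, part: List[Point]) -> Tuple[float, int]:
    """Work done on one hypothetical worker: the summed gradient of the
    squared error 0.5 * (w * x - y) ** 2 over its partition, plus the count."""
    grad_sum = sum((w * x - y) * x for x, y in part)
    return grad_sum, len(part)


def train(data: List[Point], num_partitions: int = 4,
          lr: float = 0.01, steps: int = 200) -> float:
    """Fit y = w * x with synchronous data-parallel gradient descent."""
    parts = partition(data, num_partitions)
    w = 0.0
    for _ in range(steps):
        # "Map" step: on a real cluster each partition would run in parallel.
        results = [partial_gradient(w, p) for p in parts]
        # "Reduce" step: the driver combines partial sums into one mean gradient.
        total_grad = sum(g for g, _ in results)
        total_n = sum(n for _, n in results)
        w -= lr * total_grad / total_n
    return w


# Toy data generated from y = 3 * x.
data = [(float(x), 3.0 * x) for x in range(1, 11)]
w = train(data)
```

Because the partial sums are combined before each update, the fitted weight does not depend on the number of partitions (up to floating-point rounding), which is what lets this pattern add machines without changing the result.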

Learn more:

  • TensorFlow
  • Deep Learning with Apache Spark and TensorFlow
  • TensorFlow On Spark: Scalable TensorFlow Learning on Spark Clusters
  • TensorFrames: Deep Learning with TensorFlow on Apache Spark
