Lee Yang is a Senior Principal Engineer at Verizon/Oath (formerly Yahoo), working on large-scale systems and machine learning platforms.
TensorFlowOnSpark (TFoS) was open sourced in Q1 2017 and has gained strong adoption within the Spark community for running TensorFlow training and inference jobs on Spark clusters. At Spark Summit 2017, we explained how TFoS enables Python applications to conduct distributed TensorFlow training and inference efficiently by leveraging key built-in capabilities of PySpark and TensorFlow. In this talk, we cover the major enhancements to TFoS in recent months. We will introduce a new Scala API for users who want to integrate previously trained models into an existing Scala/Spark workflow. We will describe a new Python API for Spark ML pipelines that trains all types of TensorFlow models and conducts inference and featurization without any custom code. Additionally, we will cover support for the TensorFlow Keras API and TensorFlow Datasets. Session hashtag: #DLSAIS16
In recent releases, TensorFlow has been enhanced for distributed learning and HDFS access. Outside of the Google cloud, however, users still need a dedicated cluster for TensorFlow applications. Several community projects wire TensorFlow onto Apache Spark clusters. While these approaches are a step in the right direction, they support only synchronous distributed learning and don’t allow TensorFlow servers to communicate with each other directly. This session will introduce a new framework, TensorFlowOnSpark, for scalable TensorFlow learning, which will be open sourced in Q1 2017. The framework enables easy experimentation with algorithm designs and supports scalable training and inference on Spark clusters. It supports all TensorFlow functionality, including synchronous and asynchronous learning, model and data parallelism, and TensorBoard. It provides architectural flexibility in how data is ingested into TensorFlow (pushing vs. pulling) and in the network protocol used for server-to-server communication (gRPC or RDMA). Its Python API makes integration with existing Spark libraries such as MLlib easy. The speakers will walk through multiple examples to outline these key capabilities and share benchmark results on scalability. Learn how, by changing a few lines of code, an existing TensorFlow algorithm can be transformed into a scalable application. You'll also be given tangible takeaways on how deep learning can be easily conducted in the cloud or on premises with the new framework. Session hashtag: #SFdev9