Migrating Apache Spark ML Jobs to Spark + Tensorflow on Kubeflow - Databricks

This talk will take two existing Spark ML pipelines (Frank The Unicorn, which predicts PR comments, in Scala: https://github.com/franktheunicorn/predict-pr-comments, and Spark ML on Spark Errors, in Python) and explore the steps involved in migrating them to a combination of Spark and TensorFlow. Using the open source Kubeflow project (which supports Spark as of 0.5), we will create two integrated end-to-end pipelines to explore the challenges involved and look at areas for improvement (e.g. Apache Arrow).

 

See More Spark + AI Summit Europe 2019 Videos

About Holden Karau

Independent

Holden is an Apache Spark committer and PMC member who focuses on PySpark and Kubernetes support. She is the co-author of Learning Spark, High Performance Spark, and another Spark book that’s a bit more out of date. She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Her current side project is a book to teach children distributed systems, http://www.distributedcomputing4kids.com/.