Apache Spark for Machine Learning and Data Science

This course teaches distributed machine learning with Spark. Students will build and evaluate pipelines with MLlib, understand the differences between single-node and distributed ML (and why the two may produce different results), and optimize hyperparameter tuning at scale. This class is taught concurrently in Python and Scala.

