Brad Kaiser

Senior Software Engineer, IBM

Brad is a member of the Spark Technology Center at IBM. Before that, he was a data engineer at The Weather Company, where he built data pipelines using Spark, Cassandra, Hadoop, and Parquet. These pipelines have opened up terabytes of data to TWC’s data scientists and business analysts.

SESSIONS

Supporting Highly Multitenant Spark Notebook Workloads: Best Practices and Useful Patches

Notebooks: they enable our users, but they can cripple our clusters. Let's fix that. Notebooks have soared in popularity at companies worldwide because they provide an easy, user-friendly way of accessing the cluster-computing power of Spark. But the more users you have hitting a cluster, the harder it is to manage cluster resources, as big, long-running jobs start to starve out small, short-running ones. While you could have users spin up their own EMR-style clusters, doing so reduces the ability to take advantage of the collaborative nature of notebooks. It also quickly becomes expensive, as clusters sit idle for long periods waiting on single users. What we want is fair, efficient resource utilization on a single large cluster shared by a large number of users. In this talk, we'll discuss dynamic allocation and best practices for configuring the current version of Spark as-is to help solve this problem. We'll also present new improvements we've made to address this use case. These include decommissioning executors without losing cached data, proactively shutting down executors to prevent starvation, and improving the start times of new executors. Session hashtag: #EUdev8
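
As a rough illustration of the kind of dynamic allocation setup the abstract refers to, the sketch below collects a handful of standard Spark configuration properties and renders them as `spark-submit` arguments. The property names are existing Spark settings, but every value is a placeholder chosen for illustration and does not come from the talk; the helper `to_submit_args` and the dictionary name are hypothetical.

```python
# Minimal sketch: a dynamic allocation configuration for a shared notebook
# cluster. All values are illustrative assumptions, not recommendations
# taken from the session.
DYNAMIC_ALLOCATION_CONF = {
    # Let Spark add and remove executors based on the pending workload.
    "spark.dynamicAllocation.enabled": "true",
    # External shuffle service, so shuffle files outlive removed executors.
    "spark.shuffle.service.enabled": "true",
    # Bounds on how many executors one application may hold (assumed values).
    "spark.dynamicAllocation.minExecutors": "0",
    "spark.dynamicAllocation.maxExecutors": "20",
    # Release executors that have been idle this long (assumed value).
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
    # Executors holding cached data are kept indefinitely by default;
    # a finite timeout lets idle notebooks give them back (assumed value).
    "spark.dynamicAllocation.cachedExecutorIdleTimeout": "1800s",
    # Fair scheduling between jobs inside one Spark application.
    "spark.scheduler.mode": "FAIR",
}


def to_submit_args(conf):
    """Render a config mapping as a flat list of --conf key=value arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(DYNAMIC_ALLOCATION_CONF)))
```

The trade-off in a setup like this is the one the abstract points at: a short idle timeout frees resources for other users sooner, while releasing executors that hold cached data forces that data to be recomputed later.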