Ben Sidhom - Databricks

Ben Sidhom

Software Engineer, Google

Ben is a software engineer at Google. He works on Cloud Dataproc, focusing on the scaling experience. Prior to that, he worked on embedded machine intelligence technologies that have launched in features such as Gboard suggestions and smart select in recent Android releases. Before Google, Ben was at eBay on the shipping analytics team, working on shipping estimate models.

PAST SESSIONS

Improving Apache Spark Downscaling - continues (Summit Europe 2019)

As more workloads move to serverless-like environments, the importance of properly handling downscaling increases. While recomputing the entire RDD makes sense for dealing with machine failure, if your nodes are being removed more frequently, you can end up in a seemingly loop-like scenario: you scale down and need to recompute the expensive part of your computation, scale back up, and then need to scale back down again.

Even if you aren't in a serverless-like environment, preemptible or spot instances can encounter similar issues with large decreases in workers, potentially triggering large recomputes.

In this talk, we explore approaches for improving the scale-down experience on open source cluster managers, such as YARN and Kubernetes, covering everything from how to schedule jobs to the location of blocks and their impact (shuffle and otherwise).

Improving Apache Spark Downscaling (Summit Europe 2019)

As more workloads move to serverless-like environments, the importance of properly handling downscaling increases. While recomputing the entire RDD makes sense for dealing with machine failure, if your nodes are being removed more frequently, you can end up in a seemingly loop-like scenario: you scale down and need to recompute the expensive part of your computation, scale back up, and then need to scale back down again.

Even if you aren't in a serverless-like environment, preemptible or spot instances can encounter similar issues with large decreases in workers, potentially triggering large recomputes. In this talk, we explore approaches for improving the scale-down experience on open source cluster managers, such as YARN and Kubernetes, covering everything from how to schedule jobs to the location of blocks and their impact (shuffle and otherwise).