Several Spark applications, e.g., stream processing and machine learning, require low-latency execution. Currently, Spark uses a BSP computation model and notifies the scheduler at the end of each task. Invoking the scheduler at the end of each task adds overhead, decreasing throughput and increasing latency. We observe that many of these low-latency workloads execute the same operations repeatedly, e.g., processing successive batches in streaming or iterating during model training. Based on this observation, we find that we can improve performance by amortizing scheduling work across many tasks, reducing the number of times the scheduler is invoked. In this talk, we present an alternate execution model for Spark in which we optimistically schedule several stages of a DAG in advance. Following this, executors can communicate directly with each other and make progress without any scheduler involvement. Executors also asynchronously inform the scheduler of task completions; these notifications are used for progress reporting and fault tolerance. We present an implementation of this execution model on Spark 2.0 and describe the performance benefits observed from running a number of Spark Streaming and MLlib workloads.
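The amortization argument above can be illustrated with a toy back-of-the-envelope model (a minimal sketch, not Spark's implementation; all function names and the `lookahead` parameter are hypothetical):

```python
# Toy model contrasting per-task scheduler invocation (BSP-style) with
# optimistically pre-scheduling several stages at once. Hypothetical
# names; this is an illustration, not Spark's actual scheduler code.

def bsp_invocations(num_stages: int, tasks_per_stage: int) -> int:
    # Under the BSP model, the scheduler is invoked once per completed
    # task, so invocations grow with the total task count.
    return num_stages * tasks_per_stage

def prescheduled_invocations(num_stages: int, lookahead: int) -> int:
    # If the scheduler optimistically schedules `lookahead` stages at a
    # time, it is invoked once per group of stages; task-completion
    # messages become asynchronous and leave the critical path.
    return -(-num_stages // lookahead)  # ceiling division

if __name__ == "__main__":
    stages, tasks = 100, 50
    print("BSP scheduler invocations:", bsp_invocations(stages, tasks))
    print("Pre-scheduled (lookahead=10):", prescheduled_invocations(stages, 10))
```

Under these assumed numbers, per-task notification costs thousands of scheduler invocations while pre-scheduling needs only a handful, which is the intuition behind the latency and throughput gains claimed in the abstract.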
Aurojit Panda is a fifth-year PhD student in Computer Science, advised by Scott Shenker. He works in the NetSys Lab and is also affiliated with the AMPLab. His research looks at how to provide resilience guarantees for practical distributed systems, both by statically verifying properties and by enforcing them with lightweight runtime mechanisms. Previously, from 2008 to 2011, he was at Microsoft, working on the kernel for a systems incubation project. Before that, he received an ScB in mathematics and computer science from Brown University in 2008.