Panagiotis Garefalakis is a Ph.D. candidate at Imperial College London, Department of Computing. He is affiliated with the Large-Scale Data & Systems (LSDS) group and his research interests lie within the broad area of systems including large-scale distributed systems, cluster resource management, and stream processing.
Apache Spark has enabled a vast assortment of users to express batch, streaming, and machine learning computations, using a mixture of programming paradigms and interfaces. Lately, we observe that different jobs are often implemented as part of the same application to share application logic, state, or to interact with each other. Examples include online machine learning, real-time data transformation and serving, low-latency event monitoring and reporting. Although the recent addition of Structured Streaming to Spark provides the programming interface to enable such unified applications over bounded and unbounded data, the underlying execution engine was not designed to efficiently support jobs with different requirements (i.e., latency vs. throughput) as part of the same runtime. It therefore becomes particularly challenging to schedule such jobs to efficiently utilize the cluster resources while respecting their requirements in terms of task response times. Scheduling policies such as FAIR could alleviate the problem by prioritizing critical tasks, but the challenge remains, as there is no way to guarantee no queuing delays. Even though preemption by task killing could minimize queuing, it would also require task resubmission and loss of progress, leading to wasted cluster resources. In this talk, we present Neptune, a new cooperative task execution model for Spark with fine-grained control over resources such as CPU time. Neptune utilizes Scala coroutines as a lightweight mechanism to suspend task execution with sub-millisecond latency and introduces new scheduling policies that respect diverse task requirements while efficiently sharing the same runtime. Users can directly use Neptune for their continuous applications as it supports all existing DataFrame, DataSet, and RDD operators. We present an implementation of the execution model as part of Spark 2.4.0 and describe the observed performance benefits from running a number of streaming and machine learning workloads on an Azure cluster.Gare