Apache Spark has enabled a vast assortment of users to express batch, streaming, and machine learning computations, using a mixture of programming paradigms and interfaces. Lately, we observe that different jobs are often implemented as part of the same application to share application logic, state, or to interact with each other. Examples include online machine learning, real-time data transformation and serving, low-latency event monitoring and reporting. Although the recent addition of Structured Streaming to Spark provides the programming interface to enable such unified applications over bounded and unbounded data, the underlying execution engine was not designed to efficiently support jobs with different requirements (i.e., latency vs. throughput) as part of the same runtime.
It therefore becomes particularly challenging to schedule such jobs to efficiently utilize the cluster resources while respecting their requirements in terms of task response times. Scheduling policies such as FAIR could alleviate the problem by prioritizing critical tasks, but the challenge remains, as there is no way to guarantee no queuing delays. Even though preemption by task killing could minimize queuing, it would also require task resubmission and loss of progress, leading to wasted cluster resources. In this talk, we present Neptune, a new cooperative task execution model for Spark with fine-grained control over resources such as CPU time.
Neptune utilizes Scala coroutines as a lightweight mechanism to suspend task execution with sub-millisecond latency and introduces new scheduling policies that respect diverse task requirements while efficiently sharing the same runtime. Users can directly use Neptune for their continuous applications as it supports all existing DataFrame, DataSet, and RDD operators. We present an implementation of the execution model as part of Spark 2.4.0 and describe the observed performance benefits from running a number of streaming and machine learning workloads on an Azure cluster.Gare
Panagiotis Garefalakis is a Ph.D. candidate at Imperial College London, Department of Computing. He is affiliated with the Large-Scale Data & Systems (LSDS) group and his research interests lie within the broad area of systems including large-scale distributed systems, cluster resource management, and stream processing.
Konstantinos Karanasos is a Principal Scientist Manager at Microsoft’s Gray Systems Lab (GSL), Azure Data’s applied research group. He is the manager of the Bay Area branch of GSL and tech lead for systems-for-ML efforts in the group. Konstantinos’ work at Microsoft previously focused on resource management for the company’s production analytics clusters. This work is deployed to over 300k machines across Microsoft and was key in operating the world’s largest YARN clusters. He has also contributed part of his work at Microsoft to open source projects: he is a committer and member of the Project Management Committee (PMC) of Apache Hadoop, and a contributor to ONNX Runtime. Prior to Microsoft, he was a postdoctoral researcher at IBM Almaden Research Center, working on query optimization for large-scale data platforms. Konstantinos holds a PhD from Inria, France, and a Diploma in Electrical and Computer Engineering from the National Technical University of Athens, Greece.