Drizzle is a low-latency execution engine for Apache Spark that is targeted at stream processing and iterative workloads.
Currently, Spark uses a bulk synchronous parallel (BSP) computation model and notifies the
scheduler at the end of each task. Invoking the scheduler at the end
of each task adds overheads and results in decreased throughput and
increased latency. In Drizzle, we introduce group scheduling, where
multiple batches (or a group) of computation are scheduled at once.
This helps decouple the granularity of task execution from scheduling
and amortize the costs of task serialization and launch. Our
experiments on a 128-node EC2 cluster show that Drizzle can achieve
end-to-end streaming latencies of less than 100 ms and up to
3.5x lower latency than Spark Streaming. Compared to Apache Flink, a
record-at-a-time streaming system, we show that Drizzle can recover
around 4x faster from failures and that Drizzle has up to 13x lower
latency during recovery.
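
To make the amortization argument concrete, here is a minimal sketch of a toy cost model, not Drizzle's implementation or API. It assumes a fixed scheduling overhead each time the scheduler is invoked and a fixed execution cost per batch. The function name `total_time_ms`, its parameters, and the numbers used are all hypothetical and chosen for illustration only; they are not measurements.

```python
import math


def total_time_ms(num_batches, sched_overhead_ms, exec_ms, group_size=1):
    """Toy model: the scheduler is invoked once per group of batches.

    All parameters are illustrative assumptions, not values taken from
    Drizzle or Spark.
    """
    if group_size < 1:
        raise ValueError("group_size must be >= 1")
    # One scheduler invocation per group; the last group may be partial.
    scheduler_invocations = math.ceil(num_batches / group_size)
    # Scheduling cost is paid per group, execution cost per batch.
    return scheduler_invocations * sched_overhead_ms + num_batches * exec_ms


# Scheduling every batch individually (group_size=1).
per_batch = total_time_ms(100, sched_overhead_ms=50, exec_ms=10)

# Scheduling 20 batches at once, as in group scheduling.
grouped = total_time_ms(100, sched_overhead_ms=50, exec_ms=10, group_size=20)

print(per_batch, grouped)
```

In this model the execution cost is unchanged, while the scheduling cost shrinks in proportion to the group size. It deliberately ignores effects such as stragglers and failures.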