Leverage Mesos for running Spark Streaming production jobs

Mesos is a general-purpose cluster manager that can scale to tens of thousands of nodes and handle mixed data loads and general applications. It is used in large deployments at companies such as Twitter and Airbnb. Its versatility makes it particularly appealing to organizations that have a mixed workload and want to maximize their cluster utilization (are there any others?). But how exactly does it work when the workload is a long-running Spark Streaming job? In particular, how does one deal with the failures that are bound to happen at this scale, without data loss or service disruption?

In this talk we’ll discuss how Spark integrates with Mesos and the differences between client and cluster deployments, and we’ll compare and contrast Mesos with YARN and standalone mode. Then we’ll look at a Spark Streaming application that should run 24/7 and show how to deploy, configure and tune a Mesos cluster such that:

– the application runs efficiently and uses only the resources it needs
– if any of the nodes fails (including the driver), the application recovers without data loss

About Iulian Dragos

Iulian has been working on Scala since 2004, when he joined Martin Odersky's research lab at École Polytechnique Fédérale de Lausanne (EPFL). He wrote the JVM backend and the bytecode optimizer, and worked on various other parts of the compiler. He earned his PhD at EPFL in 2010. He joined Typesafe on day one, where he worked on development tools for Scala and later took the lead of the Spark team, with a focus on making Reactive Applications with Spark a reality.

About Luc Bourlier

Luc has been working on the JVM since 2002, first at IBM on the Eclipse project, where he wrote the expression evaluation engine as part of the Debugger team. After a few other Eclipse projects, he moved to TomTom to recreate their data distribution platform for over-the-air services. He joined Typesafe in 2011 to work on the Eclipse plugin for Scala. Luc then switched to the Fast Data team, with a focus on deployment and interaction with other frameworks.