Tim Chen - Databricks

Tim Chen

CTO, Hyperpilot

Timothy Chen is the CTO of Hyperpilot and a PMC member and committer on Apache Drill and Apache Mesos. Before joining Hyperpilot, Timothy was the lead engineer at Mesosphere, working on the container runtime and Spark on Mesos.


Spark on Mesos – A Deep Dive

While Spark and Mesos emerged together from the AMPLab at Berkeley, Mesos is now one of several clustering options for Spark, along with Hadoop YARN, which is growing in popularity, and Spark’s “standalone” mode. This talk describes in detail the integration between Spark and Mesos to support clustering of Spark jobs, including the sequence of events that occurs during the life cycle of a typical Spark job. We’ll discuss recommendations for optimizing performance and resource utilization and for avoiding known limitations. We’ll also discuss possible future work for Spark on Mesos. Along the way, we’ll examine the general abstractions that Spark exposes for clustering. We’ll also compare and contrast Spark on Mesos with Spark Standalone mode and Spark on YARN, and offer suggestions for when to choose one option over the others.

Spark On Mesos: The State Of The Art

Last year we presented the current state of the art for Spark on Mesos, including their shared origins at the UC Berkeley AMPLab, features such as dynamic allocation, major users of Mesos such as Apple Siri, and the details of job invocation and resource allocation. This year we'll recap features in the Spark and Mesos integration, emphasizing new features and development in Mesos that can improve Spark deployment and scheduling, such as Quotas and GPU isolation. We'll finish with a few stories from real-world deployments and show a demo of the Spark on Mesos integration in the context of the maturing SMACK/Infinity stack, integrating Spark, Mesos, Akka, Cassandra, and Kafka (plus other tools) for streaming applications.

Apache Spark on Kubernetes

Kubernetes is a fast-growing open-source platform that provides container-centric infrastructure. Conceived by Google in 2014, and leveraging over a decade of experience running containers at scale internally, it is one of the fastest-moving projects on GitHub, with 1000+ contributors and 40,000+ commits. Kubernetes has first-class support on Google Cloud Platform, Amazon Web Services, and Microsoft Azure. Unlike YARN, Kubernetes started as a general-purpose orchestration framework with a focus on serving jobs. Support for long-running, data-intensive batch workloads required some careful design decisions. Engineers across several organizations have been working on Kubernetes support as a cluster scheduler backend within Spark. During this process, we encountered several challenges in translating Spark considerations into idiomatic Kubernetes constructs. In this talk, we describe those challenges and how we solved them. This talk will be technical and is aimed at people who are looking to run Spark effectively on their clusters. The talk assumes basic familiarity with cluster orchestration and containers. Session hashtag: #SFeco9

Additional Reading:
  • Declarative Infrastructure with the Jsonnet Templating Language