Livy is an open source, Apache-licensed REST web service for managing long-running Spark contexts and submitting Spark jobs. It is a joint development effort by Cloudera and Microsoft. Livy solves a fundamental architectural problem that plagued previous attempts to build a REST-based Spark server: instead of running the Spark contexts in the server itself, Livy manages contexts that run on the cluster under a resource manager such as YARN. This enables proper fault tolerance, high availability, session isolation, scalability, and security. Livy also provides multiple modes of interaction: REST-based jar submission, a thin Java client for fine-grained job submission and result retrieval, and submission of code snippets in string form. Livy thus enables interactive applications, as well as interactive notebooks such as Jupyter, to leverage a remote Spark cluster. In fact, Livy already powers a Spark backend for Jupyter notebooks on the HDInsight service on Microsoft Azure, which we will demo during our talk. In addition to the demo, we will describe Livy’s API, architecture, and future roadmap.
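As a rough illustration of two of the interaction modes mentioned above (code snippets in string form and jar submission), the following Python sketch builds the corresponding REST requests without sending them. The host and port in `LIVY_URL`, the helper names, and the example jar path and class name are assumptions made for illustration; the endpoint paths and JSON fields follow the Livy REST API as we understand it, and should be checked against the documentation for the Livy version in use.

```python
import json
from urllib.request import Request

# Assumption: a Livy server reachable on localhost at port 8998.
LIVY_URL = "http://localhost:8998"


def build_post(path, payload):
    """Build (but do not send) a JSON POST request to the Livy server."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        LIVY_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session(kind="pyspark"):
    """Request a new interactive session; the Spark context runs on the cluster."""
    return build_post("/sessions", {"kind": kind})


def submit_snippet(session_id, code):
    """Submit a code snippet, as a string, to an existing interactive session."""
    return build_post(f"/sessions/{session_id}/statements", {"code": code})


def submit_jar(jar_path, class_name):
    """Submit a packaged jar as a batch job (hypothetical path and class name)."""
    return build_post("/batches", {"file": jar_path, "className": class_name})
```

Passing any of these request objects to `urllib.request.urlopen` would send it to the server. The thin Java client mentioned above is a separate programmatic interface and is not covered by this sketch.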
Anand Iyer is a senior product manager at Cloudera. His primary areas of focus are platforms for real-time streaming, Apache Spark, and tools for data ingestion into Hadoop. Before joining Cloudera, he worked as an engineer at LinkedIn, where he applied machine learning techniques to improve the relevance and personalization of LinkedIn's Feed. Anand has extensive experience in leveraging big data platforms to deliver products that delight customers. He has a master's in computer science from Stanford and a bachelor's from the University of Arizona.
Pravin Mittal is a Principal Development Manager in the HDInsight group at Microsoft, where he owns the Spark and HBase services. Over the past 15 years, he has worked as a developer and manager on the database kernel and storage, SQL Azure VM Service, in-memory Hekaton, and SQL performance teams.