Is your job running slower than usual? Do you want to make sense from the thousands of Hadoop & Spark metrics? Do you want to monitor the performance of your flow, get alerts and auto tune them? These are the common questions every Hadoop user asks but there is not a single solution that addresses it. We at Linkedin faced lots of such issues and have built a simple self serve tool for the hadoop users called Dr. Elephant. Dr. Elephant, which is already open sourced, is a performance monitoring and tuning tool for Hadoop and Spark. It tries to improve the developer productivity and cluster efficiency by making it easier to tune jobs. Since its open source, it has been adopted by multiple organizations and followed with a lot of interest in the Hadoop and Spark community. In this talk, we will discuss about Dr. Elephant and outline our efforts to expand the scope of Dr. Elephant to be a comprehensive monitoring, debugging and tuning tool for Hadoop and Spark applications. We will talk about how Dr. Elephant performs exception analysis, give clear and specific suggestions on tuning, tracking metrics and monitoring their historical trends. Open Source: https://github.com/linkedin/dr-elephant
Session hashtag: #EUdev9
Akshay Rai is an engineer at Linkedin working with the Grid team. He is the lead engineer for the popular Dr. Elephant project, open sourced by Linkedin. He has been working on operational intelligence solutions for Hadoop and Spark and trying to improve the developer productivity by building systems that enable monitoring, visualizing and debugging of Big Data applications, Hadoop clusters and other related tools in real time.