Prajakta Kalmegh received her Ph.D. degree in Computer Science from Duke University in May 2019. Her dissertation work focused on analyzing resource contention in data analytics cluster frameworks and using these insights to make dynamic, contention-aware scheduling decisions for an online heterogeneous workload. Before joining Duke, she received her M.S. degree in Computer Science from Georgia Institute of Technology in May 2010. She has extensive industry and research experience with companies such as IBM Research Labs, Microsoft Corporation, SAP Labs, and Persistent Systems. She currently works as a Principal Engineer at Unravel Data.
November 18, 2020 04:00 PM PT
John submits a query and expects it to run smoothly. Based on his prior experience, he expects the query to finish in 20 minutes.
Scenario-1: John’s query finishes execution in the expected timeframe and doesn’t impact any other concurrent query in the workload.
Scenario-2: John’s query takes twice the expected time and also slows down multiple other concurrent queries. John now wonders, “Should I have submitted this query?”
At Unravel, we have implemented a Dash app, called qSteer, that proactively alerts Spark users, like John, about a possible slowdown that their query could cause or face if submitted to the cluster. It also suggests a few future time slots in which the user can submit the query instead. Query plan details fetched from Spark SQL and execution performance metrics from Spark core have enabled us to generate query quality predictors. Combined with historical query executions and the current cluster state, this data helps us derive a set of likely outcomes “if” the user’s query is submitted. qSteer uses ML techniques to flag a schedule as “unacceptable” based on prior experience, and alerts the user to possible delays from submitting a query.
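To make the alert-and-suggest flow concrete, here is a minimal Python sketch. It is not qSteer's algorithm: a fixed slowdown threshold stands in for the ML model, and the `SlotForecast` type, the `is_acceptable` and `suggest_slots` helpers, the 1.5x threshold, and all forecast numbers are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SlotForecast:
    """Hypothetical forecast for submitting the query at a given time."""
    start: str              # candidate submission time
    own_slowdown: float     # predicted runtime / expected runtime of the new query
    others_slowdown: float  # worst predicted slowdown imposed on concurrent queries


def is_acceptable(forecast: SlotForecast, max_slowdown: float = 1.5) -> bool:
    # Assumption: a simple threshold replaces the learned "unacceptable" classifier.
    return (forecast.own_slowdown <= max_slowdown
            and forecast.others_slowdown <= max_slowdown)


def suggest_slots(forecasts: List[SlotForecast],
                  max_slowdown: float = 1.5,
                  limit: int = 3) -> List[str]:
    # Keep acceptable slots and rank them by their worst predicted slowdown.
    acceptable = [f for f in forecasts if is_acceptable(f, max_slowdown)]
    acceptable.sort(key=lambda f: max(f.own_slowdown, f.others_slowdown))
    return [f.start for f in acceptable[:limit]]


# Illustrative numbers only; the first entry mirrors Scenario-2 (2x runtime).
forecasts = [
    SlotForecast("16:00", own_slowdown=2.0, others_slowdown=1.8),
    SlotForecast("16:30", own_slowdown=1.2, others_slowdown=1.1),
    SlotForecast("17:00", own_slowdown=1.0, others_slowdown=1.4),
    SlotForecast("17:30", own_slowdown=1.6, others_slowdown=1.0),
]

alert = not is_acceptable(forecasts[0])     # should John be warned right now?
suggestions = suggest_slots(forecasts[1:])  # later slots to offer instead
print(alert, suggestions)
```

In this sketch a slot is rejected if either the new query or its neighbors would slow down beyond the threshold, which reflects the abstract's point that a query can both cause and face a slowdown.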
During the talk, we will discuss the architecture and our core algorithm for performing predictive analytics, and demo our app to show how users can easily decide, at any time, whether to submit their query. We will share our experiences of using qSteer at scale at one of our customers: specifically, the challenges we faced, the scenarios where it worked and where it did not, and some lessons learned along the way.
Speakers: Prajakta Kalmegh and Yusaku Sako