Yusaku Sako

Head of Data Science, Unravel Data

Yusaku Sako is Head of Data Science at Unravel Data, where he combines his passion for data science, machine learning, and big data to deliver high-value insights that help Unravel’s customers get the most out of their big data investments. Prior to joining Unravel, he was Director of Engineering at Hortonworks, where he led the development of Apache Ambari and served as its Project Chair (VP Apache Ambari).

Past sessions

John submits a query and expects it to run smoothly. Based on his prior experience, he expects the query to finish in 20 minutes.
Scenario-1: John’s query finishes execution in the expected timeframe and doesn’t impact any other concurrent query in the workload.
Scenario-2: John’s query takes twice the expected time and also slows down several other concurrent queries. John now wonders, “Should I have submitted this query?”

At Unravel, we have implemented a Dash app called qSteer that proactively alerts Spark users like John to a slowdown that their query could cause or suffer if submitted to the cluster. It also suggests a few future time slots in which they could submit the query instead. Query plan details fetched from Spark SQL and execution performance metrics from Spark core let us generate query quality predictors. Combining this data with historical query executions and the current cluster state helps us derive a set of likely outcomes “if” the user’s query is submitted. qSteer uses ML techniques to flag a schedule as “unacceptable” based on prior experience, and alerts the user to possible delays from submitting a query.
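
The decision flow described above (predict an outcome for each candidate time slot, flag unacceptable ones, and suggest alternatives) can be illustrated with a minimal sketch. This is not qSteer's code or model: the names Slot, predict_runtime, is_acceptable, and suggest_slots are hypothetical, the runtime formula is a toy stand-in for a learned predictor, and the 1.5x tolerance is an assumed threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    start_minute: int     # minutes from now at which the query would be submitted
    cluster_load: float   # forecast cluster utilization in [0, 1] (assumed input)


def predict_runtime(baseline_minutes: float, cluster_load: float) -> float:
    """Toy stand-in for a learned model: runtime grows as the cluster fills up."""
    return baseline_minutes * (1.0 + 2.0 * cluster_load ** 2)


def is_acceptable(predicted: float, expected: float, tolerance: float = 1.5) -> bool:
    """A schedule is acceptable if the predicted runtime stays within tolerance."""
    return predicted <= expected * tolerance


def suggest_slots(slots: List[Slot], expected_minutes: float,
                  max_suggestions: int = 3) -> List[Slot]:
    """Return the earliest slots whose predicted runtime is acceptable."""
    acceptable = [
        s for s in slots
        if is_acceptable(predict_runtime(expected_minutes, s.cluster_load),
                         expected_minutes)
    ]
    return sorted(acceptable, key=lambda s: s.start_minute)[:max_suggestions]


# John's 20-minute query against a hypothetical load forecast.
slots = [Slot(0, 0.9), Slot(30, 0.6), Slot(60, 0.3), Slot(90, 0.4)]
submit_now_ok = is_acceptable(predict_runtime(20, slots[0].cluster_load), 20)
suggestions = suggest_slots(slots, expected_minutes=20)
```

In this sketch, submitting immediately would be flagged because the forecast load is high, and the user would be offered the later, less loaded slots instead. A real predictor would draw on query plan features and historical executions rather than a single load number.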

During the talk, we will discuss the architecture and our core algorithm for predictive analytics, and demo the app to show how easily users can decide, at any time, whether to submit their query. We will also share our experience of using qSteer at scale at one of our customers: the challenges we faced, the scenarios where it did and did not work, and the lessons we learned along the way.

Speakers: Prajakta Kalmegh and Yusaku Sako