Qingbo Hu

Machine Learning Engineer, Intuit

Qingbo Hu is a machine learning engineer on Intuit’s AI Engineering team and the acting tech lead for Intuit’s Model Monitoring Service project. Before joining Intuit, he worked as a data scientist at LinkedIn. Qingbo earned his Ph.D. in computer science from the University of Illinois at Chicago.

PAST SESSIONS

How Intuit uses Apache Spark to Monitor In-Production Machine Learning Models at Large-Scale

Summit 2020

The presentation introduces the Intuit AI Model Monitoring Service (MMS). MMS is an in-house, Spark-based solution developed by Intuit AI to provide ongoing monitoring of both data (statistics of model input/output, etc.) and model metrics (precision, recall, AUC, etc.) for in-production ML models. The project will soon be open sourced. MMS aims to tackle multiple challenges of in-production ML model monitoring:

  1. Integration of multiple data sources from different time ranges: to generate all the metrics needed to monitor an in-production model, we often need to integrate multiple datasets with different schemas from different time ranges. For example, to compute model metrics like AUC, the ground truth is always collected in a separate dataset, with a delay of a few days or even months after the model's output data is recorded. In other cases, we might need to integrate additional dimensional data so that we can create different segments and analyze the model per segment.
  2. Reusable and extendable metric and segmentation library: developing separate metric and segmentation logic for each model does not scale. Creating a reusable yet extendable library to hold this logic is challenging, because different models might have distinct data schemas.

Model owners can use MMS to create and schedule pipelines that monitor in-production models without writing any code. MMS can integrate generic data and also provides a programming API for adapting to the specific data schema generated by a given ML platform. MMS also allows developers to use its APIs to create reusable metric and segmentation logic in an open-contribution library. MMS pipelines are highly scalable: Intuit is using MMS to integrate 10M+ rows and 1K+ columns of in-production data to generate 10K+ metrics for in-production models.
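The first challenge above, joining model outputs with ground truth that arrives later and then computing a metric per segment, can be sketched in plain Python. This is only an illustration under stated assumptions: the record fields (`id`, `score`, `segment`, `label`) and the function names `auc` and `auc_by_segment` are hypothetical, and the sketch uses neither Spark nor the MMS API.

```python
from collections import defaultdict


def auc(scores, labels):
    """Pairwise AUC: chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined unless both classes are present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_segment(predictions, ground_truth):
    """Join predictions to late-arriving labels by id, then compute AUC per segment."""
    labels_by_id = {row["id"]: row["label"] for row in ground_truth}
    grouped = defaultdict(lambda: ([], []))
    for row in predictions:
        # Inner join: skip predictions whose ground truth has not arrived yet.
        if row["id"] in labels_by_id:
            scores, labels = grouped[row["segment"]]
            scores.append(row["score"])
            labels.append(labels_by_id[row["id"]])
    return {seg: auc(s, l) for seg, (s, l) in grouped.items()}


# Hypothetical data: model outputs recorded first, labels collected later.
predictions = [
    {"id": 1, "score": 0.9, "segment": "US"},
    {"id": 2, "score": 0.2, "segment": "US"},
    {"id": 3, "score": 0.7, "segment": "CA"},
    {"id": 4, "score": 0.4, "segment": "CA"},
    {"id": 5, "score": 0.8, "segment": "CA"},  # label not yet collected
]
ground_truth = [
    {"id": 1, "label": 1},
    {"id": 2, "label": 0},
    {"id": 3, "label": 0},
    {"id": 4, "label": 1},
]

result = auc_by_segment(predictions, ground_truth)
print(result)
```

Keeping the metric (`auc`) separate from the join-and-segment step is one way to make the metric reusable across models with different schemas, which is the concern raised in the second challenge.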

Multi-Label Graph Analysis and Computations Using GraphX

Summit 2017

In real-life applications, we often need to analyze graphs whose nodes and edges are associated with multiple labels. For example, in a graph that represents user activities in social networks, the labels associated with nodes may indicate their membership in communities (e.g. group, school, company, etc.), and the labels associated with edges may denote types of activities (e.g. comment, like, share, etc.). The current GraphX library in Spark does not directly support efficient analysis and computation on label-defined subgraphs. In this session, the speakers will propose a general API library that supports analysis on multi-label graphs and can be reused and extended to design more complicated algorithms. It includes a method to create multi-label graphs and calculate basic statistics and metrics at both the global and subgraph level. Common graph algorithms, such as PageRank, can also be efficiently implemented in a parallel scheme by reusing modules and algorithms in GraphX, such as the Pregel API. See how LinkedIn is able to leverage this tool to efficiently find top LinkedIn feed influencers in different communities and by different actions. Session hashtag: #SFml3
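The idea of a label-defined subgraph can be illustrated with a small single-machine sketch in plain Python. It is not the speakers' library and does not use GraphX or the Pregel API: the helper names `label_subgraph` and `pagerank`, the sample users, and the labels are all hypothetical, and the ranking is a textbook power-iteration PageRank.

```python
def label_subgraph(edges, node_labels, node_label, edge_label):
    """Keep edges of the given type whose endpoints both carry the node label."""
    return [
        (src, dst)
        for src, dst, labels in edges
        if edge_label in labels
        and node_label in node_labels.get(src, set())
        and node_label in node_labels.get(dst, set())
    ]


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; rank held by dangling nodes is spread uniformly."""
    nodes = sorted({n for edge in edges for n in edge})
    if not nodes:
        return {}
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out[src]:
                new_rank[dst] += damping * rank[src] / len(out[src])
        rank = new_rank
    return rank


# Hypothetical data: node labels are communities, edge labels are activity types.
node_labels = {"ann": {"acme"}, "bob": {"acme"}, "cat": {"acme"}, "dan": {"uic"}}
edges = [
    ("bob", "ann", {"like"}),
    ("cat", "ann", {"like", "comment"}),
    ("ann", "bob", {"comment"}),
    ("dan", "ann", {"like"}),
]

sub = label_subgraph(edges, node_labels, node_label="acme", edge_label="like")
ranks = pagerank(sub)
top = max(ranks, key=ranks.get)
print(top, ranks)
```

Here the subgraph is restricted to "like" edges between members of the "acme" community, so the ranking reflects influence within one community and for one action type, in the spirit of the influencer use case described above.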