I am a Software Engineer and Data Scientist with 14 years of experience. I have published papers in top peer-reviewed conferences and have been granted patents. In my current role, I manage a team of data scientists and engineers developing core ML services at Adobe. Our services are used by various Adobe Sensei Services that are part of Experience Cloud. I hold Master's and Bachelor's degrees in Computer Science from leading universities in India.
Many high-tech industries rely on machine-learning systems in production environments to automatically classify and respond to vast amounts of incoming data. Despite their critical role, these systems are often not actively monitored. When a problem first arises, it may go unnoticed for some time. Once it is noticed, investigating its underlying cause is a time-consuming, manual process. Wouldn't it be great if the model's output were automatically monitored? If it could be visualized and sliced by different dimensions? If the system could automatically detect performance degradation and trigger alerts? In this presentation, we describe our experience building one such core machine-learning service: Model Evaluation.
Our service provides automated, continuous evaluation of a deployed model's performance using commonly used metrics such as area under the curve (AUC) and root-mean-square error (RMSE). In addition, it computes summary statistics and distributions of the model's output. The service also provides a dashboard to visualize a model's performance metrics, summary statistics and distributions over time, along with REST APIs to retrieve these metrics programmatically.
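As a rough illustration of the two metrics named above, the following sketch computes RMSE and AUC in plain Python. It is not the service's implementation; the function names `rmse` and `auc` are chosen here for illustration, and AUC is computed with the standard rank-sum (Mann-Whitney U) formulation, with tied scores receiving average ranks.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def auc(labels, scores):
    """AUC for binary labels (0/1) via the rank-sum formulation."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    # Assign 1-based ranks by ascending score; tied scores share their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

For example, `auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` evaluates to 0.75, because three of the four positive-negative pairs are ranked correctly.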
These metrics can be sliced by input features (e.g., geography, product type) to provide insights into model performance over different segments. The talk will describe the components required to build such a service and the metrics of interest. Our system has a backend component built with Spark on Azure Databricks. The backend can scale to analyze terabytes of data to generate model evaluation metrics.
We will talk about how we modified Spark MLlib to compute AUC sliced by different dimensions, and about other Spark optimizations that improve compute efficiency and performance. Our front end and middle tier, built with Docker and Azure Web Apps, provide visuals and REST APIs to retrieve the above metrics. This talk will cover various aspects of building, deploying and using the above system.
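To make the idea of a sliced metric concrete, here is a minimal single-machine sketch in plain Python. It is not the Spark MLlib modification described in the talk; it only shows the grouping logic. The record fields (`geo`, `label`, `score`), the function names and the sample data are all made up for illustration.

```python
from collections import defaultdict


def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half. Returns None when a slice contains only one class,
    because AUC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_slice(records, slice_key):
    """Group records by one dimension and compute AUC within each group."""
    groups = defaultdict(lambda: ([], []))
    for rec in records:
        labels, scores = groups[rec[slice_key]]
        labels.append(rec["label"])
        scores.append(rec["score"])
    return {key: pairwise_auc(lbls, scrs) for key, (lbls, scrs) in groups.items()}


# Illustrative records only; not real evaluation data.
records = [
    {"geo": "US", "label": 1, "score": 0.9},
    {"geo": "US", "label": 0, "score": 0.2},
    {"geo": "US", "label": 0, "score": 0.95},
    {"geo": "IN", "label": 1, "score": 0.7},
    {"geo": "IN", "label": 0, "score": 0.3},
    {"geo": "EU", "label": 1, "score": 0.6},
]
print(auc_by_slice(records, "geo"))
```

The pairwise count is quadratic in slice size, which is fine for a toy example but is exactly the kind of cost a distributed implementation has to avoid at terabyte scale.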
Have you ever wondered how an ML model works? Why does it come up with certain predictions and not others? Have you seen a model behave in ways that are weird or counter-intuitive? Do you lack trust in your model because it is a black box? The recent rise in popularity of deep-learning neural-net models, impenetrable as they are even to their creators, has underscored the importance of mathematical frameworks for model interpretability. However, even simple models such as linear models can be hard to interpret for those without sufficient technical expertise.
This talk will survey various approaches to model interpretability in both academia and industry. We will showcase global and local (instance-level) insights, using a particular model as an example. Global interpretability is valuable in providing a summary-level understanding of the model's behavior. However, the complex nature of the model makes a global summary inaccurate at the instance level. Hence, we augment it with instance-level interpretations.
We will talk about how we built and deployed in production an algorithm that can interpret black-box models at both the global and local levels. Our system has a backend component built with Spark on Azure Databricks. The backend can scale to analyze millions of data points to generate explanations. We will talk about Locality Sensitive Hashing (LSH) and other Spark optimizations that improve compute efficiency and performance. The proposed method is far more efficient than prior art, which is compute-intensive. Our front end and middle tier, built with Docker and Azure Web Apps, provide visuals and REST APIs to retrieve the model interpretations. This talk will cover various aspects of building, deploying and using the above system.
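The abstract does not say which LSH family is used or where it sits in the pipeline. As a generic illustration of why LSH saves compute, the sketch below uses random-hyperplane hashing (an assumption made here, suited to cosine similarity): points with the same bit signature land in the same bucket, so a query is compared only against its bucket rather than against every point. All function names are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals with Gaussian components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def build_index(points, hyperplanes):
    """Map each signature to the indices of the points that hash to it."""
    buckets = defaultdict(list)
    for idx, point in enumerate(points):
        buckets[signature(point, hyperplanes)].append(idx)
    return buckets


def candidates(query, buckets, hyperplanes):
    """Return indices of points sharing the query's bucket (approximate neighbors)."""
    return buckets.get(signature(query, hyperplanes), [])


planes = make_hyperplanes(num_bits=8, dim=2, seed=42)
points = [[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0]]
index = build_index(points, planes)
print(candidates([3.0, 0.0], index, planes))
```

Vectors pointing in the same direction always share a signature, so the query above retrieves the first two points and skips the third. More bits give smaller, more precise buckets at the cost of missing some true neighbors, which is the usual trade-off to tune.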