EPISODE 7

Interpretable Machine Learning

What does it mean for a model to be “interpretable”? Ameet Talwalkar shares his thoughts on IML (Interpretable Machine Learning), how it relates to data privacy and fairness, and his research in this field.

Ameet Talwalkar
Ameet Talwalkar is an assistant professor in the Machine Learning Department at CMU, and also co-founder and Chief Scientist at Determined AI. His interests are in the field of statistical machine learning. His current work is motivated by the goal of democratizing machine learning, with a focus on topics related to automation, fairness, interpretability, and federated learning. He led the initial development of the MLlib project in Apache Spark, is a co-author of the textbook ‘Foundations of Machine Learning’ (MIT Press), and created an award-winning edX MOOC on distributed machine learning. He also helped to create the MLSys conference, serving as the inaugural Program Chair in 2018, General Chair in 2019, and currently as President of the MLSys Board.

Video Transcript

The Beans, Pre-Brewing

Denny Lee (00:06):
Welcome to Data Brew by Databricks with Denny and Brooke. This series allows us to explore various topics in the data and AI community. Whether we’re talking about data engineering or data science, we will interview subject matter experts to dive deeper into these topics. And while we’re at it, we’ll be enjoying our morning brew. My name is Denny Lee, I’m a developer advocate here at Databricks and one of the co-hosts of Data Brew.

Brooke Wenig (00:31):
And hello everyone, my name is Brooke Wenig, the other co-host of Data Brew and machine learning practice lead at Databricks. Today I have the pleasure of introducing my advisor, Ameet Talwalkar, to Data Brew. Ameet is the Chief Scientist at Determined AI and assistant professor at CMU. Welcome, Ameet.

Ameet Talwalkar (00:47):
Thanks, great to be here.

Brooke Wenig (00:48):
All right. So I know you have a very long history in the field of machine learning, but I want to rewind it a little bit. What got you into the field of machine learning?

Ameet Talwalkar (00:55):
Yeah, so it was a pretty long and windy and kind of random walk. I guess starting even with, at an early age I was always really excited by math and also biology, I would say. In college, I started as an econ major, I ended up stumbling into computer science. I didn’t really think I would do anything in computer science after college. I had a few different jobs right out of college, largely I was playing a lot of ultimate Frisbee. But at one point I stumbled into working in a neuroscience lab, and I both realized I was really interested in how the brain works and just understanding details of how the brain works but I also wanted to do something mathier. And I sort of then stumbled into grad school and discovered machine learning which was to some extent satisfying that criteria, much more mathy. Depending on who you talk to in machine learning, it’s to some extent motivated by the brain or not, but it is somehow trying to use data to make predictions and somehow related to intelligence, again, depending on who you talk to.

Ameet Talwalkar (02:01):
But I guess to summarize again, I’ve always been really excited by math. I discovered an interest in computer science before it was cool and then I also stumbled my way into machine learning again well before it was cool. And I started grad school in the mid 2000s, living in New York City, all my friends were working in finance. I loved what I was doing but nobody else seemed to be all that interested in what I was doing, which is a night and day difference between then and now, not just with me but with the field obviously. So that’s been interesting.

Brooke Wenig (02:34):
Yeah, I stumbled into the field of computer science once it was already pretty cool, this was back in 2012, and Ameet’s actually the main reason why I got into the field of machine learning. I was taking an online course, this was way back when Databricks had these online courses that they ran through edX on distributed machine learning in Spark. And I saw that Ameet was teaching the course when I was interning at Splunk trying to use Spark there and I reached out to Ameet and said, “Hey, can I take your class in the fall?” And he said, “Yeah, no problem.” So that was how I got into the field of machine learning was through Ameet, and then he ended up being my advisor.

Ameet Talwalkar (03:06):
Yeah, Brooke was great to work with and your Chinese was also, every day she would surprise me with something new. She’s fluent in a bunch of languages, she’s brilliant, she asks a lot of great questions. And I guess we’re going to talk a little bit more about interpretability later, but it’s kind of funny because a project that you worked on a little bit with us related to decision trees was a project that we then kind of morphed into our first work in interpretability back in 2017/2018. So kind of all comes full circle I would say, a little bit.

Brooke Wenig (03:38):
Well thank you for the kind words. I think this is actually a really nice transition into, what are you currently working on? What are your current research areas of interest and how have they shifted over time?

Ameet Talwalkar (03:47):
Yeah, so I think from the time I started in grad school, soon afterwards I spent a lot of my time interning at Google research in New York City and it just felt like there was a disconnect a little bit at that time for me at least between the field of machine learning and academics versus how people were starting to be interested in it in industry in a sense that it felt like the, maybe not the biggest problem, but a big, big problem and opportunity for machine learning back in 2008 I felt was kind of getting it out the door. It was very complicated, very mathy, it sort of seemed like you needed a PhD in computer science or statistics or math to have any hope of actually using these sorts of methods. But at that point, at least at places like Google, data was already becoming the most important thing to them. There seemed to be a lot of opportunity to take advantage of this data and use machine learning and other sorts of predictive tools and it feels like getting it out the door was the real problem.

Ameet Talwalkar (04:45):
And so I would say to answer your question of what I’m working on now or how has it changed, I’d say that starting as early as 2008/2009, my interest has been getting machine learning out the door, I also call that democratizing machine learning. But really coming up with principled tools that allow people to use machine learning either more easily or at this point maybe more safely. So that has been kind of the underlying motivation for my work for the last decade or so. It started, to me a big question or a big challenge, and this wasn’t just for me it was in the field, was one of the scale of data back in the late 2000s and early 2010s, and that’s what led me to be interested in distributed computing and Spark in particular.

Ameet Talwalkar (05:32):
So after my PhD, I went to Berkeley for a postdoc and that was working in the AMPLab right when Spark was starting to take off. So I worked side by side with other folks working on Spark and also side by side with systems people, coming from a much more ML background. And it was just super interesting and it remains that way for me today to work with people from different backgrounds. I found working in machine learning with other machine learning researchers, we would all ask, and obviously they’re very, very smart people, but we kind of had all of the same context. So sometimes we wouldn’t ask the basic questions, we’d immediately go to the really technical questions. You talk to people who are a little further away, they would ask seemingly the really simple, basic questions, but in some way they’re the most piercing questions because they question underlying assumptions that other people in your very small field all for better or worse take for granted.

Ameet Talwalkar (06:27):
And anyways, going back to your broad question that the way my work has changed over time has been in terms of what I think are the most pressing problems or some of the more interesting or emerging problems in terms of getting machine learning out the door or democratizing it. And I’d say specifically, the three sets of problems I’ve worked on over time, first focus on scale, parallelism, scalability, and that’s working in the AMPLab, working on Spark, working with those folks has been super amazing and fun. The second stage for me, and obviously it wasn’t so disjointed, these things are all overlapping, the second set of things I worked on were related to automation and I think Liam came on on your podcast at some point and talked about some of the stuff that we worked on together there.

Ameet Talwalkar (07:13):
And more recently, and again I think scalability and automation are still really fundamental bottlenecks and super important obviously, but a lot of research and problems that I’ve been thinking about recently have been moving more towards this notion of safety. So now people are using machine learning more, due to things like Spark and TensorFlow and PyTorch and just the general society becoming more educated on machine learning and deep learning. So people are using it more and more, which is great, but it’s also kind of scary because people can be using it in the wrong way. It’s sort of terrifying in some sense. So people that are already using it today may not be using it safely, the people who are going to be wanting to use machine learning in the future are likely going to be potentially less and less experts in these different fields. It’s going to be undergrads who are AI majors from Berkeley or CMU, rather than PhDs from those places. And we need to make these tools easier and safer for people to use, and I think interpretability is a really important part of the safety equation. So I don’t know if that answers your question or partly answers your question.

Denny Lee (08:22):
I think it definitely doesn’t, but I’m actually going to go do a bit of a retrospective back because you covered a lot. So first things first, the most important question, do you still play ultimate Frisbee?

Ameet Talwalkar (08:32):
I throw the Frisbee a little bit but I’ve torn my ACL more than once. Brooke knows, I like biking more and part of the reason I like biking a lot is because it’s much nicer on your knees. I miss that a little bit in Pittsburgh because the hills aren’t quite the same as what they are in California, the weather’s also a little bit different. But I play a little bit of ultimate but much less than I used to for knee reasons.

Denny Lee (08:54):
Okay, so do what I do which is bike so that you can play ultimate, that’s the key thing here. Back to the real questions, my apologies, but I had to segue off. You start off talking about exactly your three points and the evolution of your history from scale and automation to safety. So let’s start a little bit on the scale and automation part for those who may not have actually heard Liam’s podcast yet. How did working with machine learning within the context of a distributed computing environment, like the Spark project, allow you to scale? Like what were you trying to address at that time? Because for a lot of folks, they might not actually have that historical context.

Ameet Talwalkar (09:35):
Yeah, back in the day and I think this is still true, the broad thing you’re trying to do with machine learning is to learn from the data that you have and learn underlying patterns in your data. And the general rule of thumb is that more data is better, the more data you have, the stronger the signals in your data are going to be, maybe the more nuanced the signals and the patterns that you’re potentially able to learn. So on one end, data is the underlying currency, it’s a thing that potentially will allow you to do what you want to do. But the problem is that classical machine learning methods and statistical methods were really proposed and studied largely from a statistical and learning theory perspective. These methods were developed at a time when we didn’t have that much data, we had 100 points or 1000 data points and everything could fit on your laptop running in MATLAB or R. And so the real focus was on statistical problems, and those again have not gone away, those are super interesting obviously and super important. But there was a real lack of people thinking about how to take, let’s say something like an SVM, kernel methods were king in the 2000s.

Ameet Talwalkar (10:53):
So you take something like a kernel method, which was motivated for a bunch of reasons, and now all of a sudden let’s say you want to train your SVM instead of on 100 points or 1000 points, on 20 million points. And this was a problem that I worked on before studying at Berkeley when I was at Google or NYU and Google. And even storing a kernel matrix that’s 20 million by 20 million, the details here don’t matter, but storing a dense 20 million by 20 million matrix is just a super hard thing to do from a storage point of view, on one machine let alone on a bunch of machines. And so that’s just one example. But even for large linear models or random forest or decision trees or whatever sort of models that you’re thinking, on one hand there’s this tension between wanting more data to train better models and the underlying computational and storage concerns associated with actually trying to run efficient algorithms to actually train these models. And so something like Spark or distributed computing is, as we all know, really powerful and nice in terms of it allows you to get access to much more storage and computation, but it also leads to new complications in terms of how do you have these different parallel workers communicating with each other.
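To put a rough number on the storage problem described here, the following is a back-of-the-envelope sketch. It assumes each matrix entry is an 8-byte double-precision float, which is an assumption for illustration and not something stated in the conversation; the helper name `dense_matrix_bytes` is likewise hypothetical.

```python
def dense_matrix_bytes(n_rows, n_cols, bytes_per_entry=8):
    """Bytes needed to store a dense matrix (assumes 8-byte float64 entries)."""
    return n_rows * n_cols * bytes_per_entry


# Kernel matrix on 1,000 points: 1,000 x 1,000 entries, easy to hold on a laptop.
small = dense_matrix_bytes(1_000, 1_000)

# Kernel matrix on 20 million points: 20M x 20M entries.
n = 20_000_000
large = dense_matrix_bytes(n, n)

print(f"1,000 points:      {small / 1e6:.1f} MB")
print(f"20 million points: {large / 1e15:.1f} PB")
```

Under that assumption the small case needs about 8 MB, while the 20 million point case needs about 3.2 petabytes. Exploiting symmetry would roughly halve that, which still leaves it far beyond a single machine.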

Ameet Talwalkar (12:06):
And of course, Spark provides a really nice API and programming model to allow you to do this, but you still need to come up with the underlying algorithms that work under the MapReduce or the Spark paradigm. And especially in the 2010s once I started at Berkeley, that was a lot of what I was thinking about, while also learning a lot about distributed computing and about Spark and so on. So I in theory was working on distributed machine learning during grad school, but I was doing it with just other ML people and theory people and I knew next to nothing about systems. So I was thinking of very simple naïve divide and conquer methods, implementing everything in MATLAB. Moving to Berkeley, working with systems people, allowed me to think about this in a more robust sort of way, which is super interesting for me.
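As one illustration of what an algorithm that fits the MapReduce paradigm can look like, the sketch below computes a least-squares gradient by summing per-partition gradients. It is plain NumPy, not Spark code: the partitions are simulated with `np.array_split`, and the function names are hypothetical. It is not a description of any specific method from the conversation.

```python
from functools import reduce

import numpy as np


def partition_gradient(w, X_part, y_part):
    """'Map' step: gradient of 0.5 * ||X w - y||^2 on one data partition."""
    return X_part.T @ (X_part @ w - y_part)


def distributed_gradient(w, partitions):
    """'Reduce' step: sum the per-partition gradients into the full gradient."""
    grads = map(lambda part: partition_gradient(w, *part), partitions)
    return reduce(np.add, grads)


rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
y = rng.standard_normal(1000)
w = np.zeros(5)

# Simulate four workers, each holding a horizontal slice of the data.
partitions = list(zip(np.array_split(X, 4), np.array_split(y, 4)))

# Reference gradient computed on the full dataset in one place.
full_gradient = X.T @ (X @ w - y)
print(np.allclose(distributed_gradient(w, partitions), full_gradient))
```

The pattern works because the loss is a sum over data points, so each worker only needs its own slice plus the current weights, and only a small gradient vector has to be communicated per worker.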

Denny Lee (12:55):
I completely get you. In my past, that’s exactly what I did, basically could I shove everything into a gigantic MATLAB server and hope the heck it would actually not blow up in the process. So actually I wanted to segue a little bit more to our actual topic about interpretability, but you did mention that’s sort of important which is about how to work with models more safely. So can you provide maybe some examples of machine learning models that are not safe? Like how we’re not using them safely?

Ameet Talwalkar (13:28):
Yeah, I feel like we hear horror stories. And to be clear, I think there’s more questions right now than there are answers in this area, which is why I think it’s an interesting set of research problems, but I think there’s probably a lot of examples. Two really natural ones I think are one, related to privacy. So our data’s being collected by a bunch of different organizations. In some cases we know about it, in some cases we don’t, but obviously there is some private information probably in some of the data that is being collected on us. And there’s a question to what extent these models, A, are using our data and B, if they are using our data to what extent is it revealing private information about ourselves? And so there’s a lot of growing work in the field of machine learning on privacy. Federated learning is an example of this, of keeping your data local so that to some extent it’s motivated by mitigating privacy concerns. Differential privacy has become a topic people think a fair bit about, but one way not to be safe is to have models that are trained on some data, that data somehow revealing sensitive information about the people or the organizations or whatever sensitive information it’s revealing, whatever sensitive information is in the underlying data. So, that’s one.

Ameet Talwalkar (14:38):
And the other obvious thing that people talk about is fairness or lack thereof. So at one point there was this thought that machine learning could be this beautiful solution to get rid of human bias because machine learning isn’t biased itself, it’s just learning patterns. And that’s a nice idea, but what machine learning is really doing is learning patterns in data that you’ve collected, and in many cases amplifying biases that already exist in your underlying data itself. And we all know that the way our data is collected is often in a very biased fashion, and there’s huge numbers of examples of this. So the point is that using machine learning can really give you a false sense of being unbiased, when in fact what it’s often doing is propagating existing biases in data and doing it in a way that’s, people call it, there’s an analogy, there’s a cute phrase for it that I can’t remember, but it’s like equivalent to money laundering or data laundering or bias laundering, that’s what people call it. So not only is there bias in machine learning models, but you can imagine it’s doing it in subtle ways, at this point I think people know that this is a major problem. But I think those are two real safety concerns.

Ameet Talwalkar (15:54):
You could potentially argue that there’s other safety concerns related to just deployment in terms of is this draining the battery in my car or my cell phone or something like that? But generally when I think about safety I’m thinking about largely privacy and fairness concerns.

Brooke Wenig (16:11):
So why do you think there’s so much emphasis on the field of privacy and fairness now compared to like five years ago? Is it because there’s more use cases, more people working in machine learning? Just want to get your thoughts as to why there’s more emphasis now.

Ameet Talwalkar (16:22):
I think it’s the first thing you said. People are actually using machine learning, it’s actually influencing our lives a lot. And if you look at the trend of 10 years ago, 10 years ago nobody cared about machine learning other than machine learning researchers. Now, it’s probably over-hyped and people are talking about it too much, but there’s a reason people are so excited about it. We’ve seen a bunch of transformational applications powered by machine learning and there’s a real thought that it’s just the tip of the iceberg. So for every application or every organization that’s using machine learning today, you expect there to be orders of magnitude tomorrow. And so, if we’re already seeing bias and fairness and privacy problems manifesting today, those are just going to get worse and worse over time as people are predicting more and more people are using machine learning.

Brooke Wenig (17:15):
Got it. Well I think this is an excellent segue into the paper you recently published titled, Towards Connecting Use Cases and Methods in Interpretable Machine Learning. Because you talk a lot about this disconnect between use cases and methods and machine learning interpretability. Could you walk us through some of the key tenets that you address in that paper?

Ameet Talwalkar (17:31):
Yeah, sure. And I should start by saying the credit goes to my students here, so Valerie, Jeff, Greg and Joon, they did a lot of the hard work here, all of the hard work here. And the story behind this paper was largely just us thinking about what we think are the important next steps and the current problems in the field of interpretability. And I think a lot of what we were saying is, a lot of it is not new and a lot of it is hopefully things that people already know, but hopefully it’s packaged in a way that is accessible for people who are new in the field and it’s a modern take on problems that people have already known about. But the core idea behind the paper is just that ultimately interpretability, the field of interpretable machine learning is meant to be creating a set of methods that people can use in practice. But there’s really a huge disconnect right now between the researchers who are studying this problem and coming up with new methods and practitioners who are using machine learning and saying that they want to understand their models.

Ameet Talwalkar (18:37):
And this disconnect is really I think a real bottleneck in the field right now. And I think an underlying reason why there is this huge disconnect is that the idea of interpretability is kind of inherently squishy. Like what does it mean for something to be interpretable? It’s not an obvious thing to define. And we had a reading group back at CMU maybe three years ago now about this, where we were just reading about at that time what the main papers were in interpretability, the main works and different ideas. One week we had, one of the people in that reading group was a PhD in English, which I was surprised about, but he was able to keep up. But we asked him to present one week and he presented on various ideas in psychology perspective on interpretability. And sure enough, I’m no expert in psychology, but my vague recollection of what we talked about in that reading group was that even in the field of psychology they’ve thought about what interpretability means from just a psychology human perspective and unfortunately there’s no one clear answer. So if psychologists had been thinking about formally trying to define interpretability for decades and haven’t converged on yet one answer, it’s unlikely for we in the machine learning community or the computer science community to solve that problem in a matter of a few years.

Ameet Talwalkar (19:58):
All right, so the fundamental problem is that there are researchers developing new methods for what they think the problems are in interpretable ML, and then there’s practitioners who are using, say, a neural network and trying to understand what these models are doing. And these two communities want to be working with each other but are not, and they really could help each other but they’re not. And what that means is that people are, and again people aren’t doing this to be malicious or anything, but a lot of research in the interpretable ML community is focused on problems that may or may not actually be problems that anyone cares about. So people are coming up with abstract problems saying that this is an interpretability question, but there’s no real motivating application for it, it’s really hard to even evaluate whether their methods are even working. And so they write this nice paper, there are cool ideas, maybe there’s nice visuals of different neurons in a neural network lighting up or whatever, but that’s never going to translate to somebody actually using this stuff in practice because we don’t actually understand what practitioners want. Similarly, these practitioners, they want to understand their models but they may not know enough about ML to be able to formulate quantitative, mathematically precise questions that researchers can actually answer themselves.

Ameet Talwalkar (21:15):
And so the point of this paper was really to further highlight this problem, and again, other people have talked about this before, but sadly this I think still remains a really, really big bottleneck in the field. And a related bottleneck is, how do you evaluate the quality of any new interpretability method that you’re coming up with? Any evaluation needs to be somehow based on a real application or motivated in some way as a proxy for performance in a real application. And if we don’t know what these end to end pipelines look like and how a method’s actually going to be used in practice, it’s just really hard to even know what problem to be solving.

Ameet Talwalkar (21:52):
And so the point of the paper wasn’t just to be negative, it was to say, “Hey, this is a problem, this is what we do know.” And it was kind of proposing a taxonomy, organizing the information we do know about different types of methods, and focusing on what I would argue is the “easiest” interpretability problem for practitioners, which is model debugging, which is still not an easy problem at all, but it’s relatively easier than having a doctor trust your machine learning model. That’s orders of magnitude even harder, I don’t know how to solve that problem. But I think model debugging is a problem we can hope to try to solve with data scientists working in collaboration with ML researchers to ground the research in real problems and to end to end evaluations ending in actual applications. Anyway, so that was a very long summary of what we talked about, hopefully that makes some sense, but I’ll stop.

Brooke Wenig (22:44):
Yeah, that definitely makes sense. And so in the paper there are three key tenets, problem definition, method selection and method evaluation. The one that resonated most with me was the problem definition because I see this with all of our customers whenever we’re trying to scope engagements of what is it you’re actually trying to solve? They’ll often say, “I want the model with the best accuracy.” I’m like, “All right, are you trying to optimize for accuracy? Do you care more about false positives, false negatives?” There’s often a lot of disconnect between the business side and the machine learning engineers or the data scientists. So this paper definitely resonated with me from that perspective.
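The gap between "best accuracy" and the errors a business actually cares about can be seen in a small made-up example. The data below is hypothetical (1,000 cases, 10 of them positive) and the helper `confusion_counts` is written for this sketch only; it is not drawn from the paper or from any customer engagement.

```python
def confusion_counts(y_true, y_pred):
    """Return (true_pos, false_pos, false_neg, true_neg) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


# Hypothetical imbalanced dataset: 10 positives out of 1,000 cases.
y_true = [1] * 10 + [0] * 990

# A "model" that always predicts the negative class.
y_pred = [0] * 1000

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2f}, false negatives={fn}, recall={recall:.2f}")
```

This always-negative model reaches 99% accuracy while missing every positive case, which is why the question of whether false positives or false negatives matter more has to be settled before a metric is chosen.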

Ameet Talwalkar (23:16):
Yeah, and I think our proposal in the paper is very similar to what you said and there’s no easy answer there, you have to talk to each other. There’s no way me as a researcher I can just say, “This is the interpretability problem people care about.” I need to talk to people in the field who actually have these problems, try to understand their applications and their problems well enough such that I can then translate that or abstract that problem into something maybe more formal and mathematical and then try to solve it. And a big argument that we make in the paper is that that sort of process needs to be happening over and over again. So researchers need to be reaching across the aisle to talk to practitioners and vice versa and we need to be solving these problems together over and over again such that we get more abstract problems that are actually grounded in reality. And I think that’ll be fun and interesting to do and we’re starting to do that in my group as well to learn and figure out what are the most interesting problems to solve. But I think there’s a lot left to do there.

Denny Lee (24:14):
Right, I was actually hoping you would provide an answer that said, “Well now I don’t have to talk to anybody because we’re doing all of this on computers.” But I guess that’s not the case.

Ameet Talwalkar (24:22):
Yeah, that was sort of a concern I had starting in interpretability was related, when we started working in interpretability the way people evaluated their results were user studies. And I was really nervous about running user studies because it just felt like that was not an area of expertise for me, I didn’t know if I trusted the results so much. I don’t think user studies are the be all end all, user studies in the sense of using Mechanical Turkers to evaluate the quality, I think you actually need something deeper with user studies which is you actually work with practitioners to understand their problems, which is actually more time consuming than running a Mechanical Turk experiment, but I think potentially way more rewarding as well.

Denny Lee (25:03):
No it makes sense, in some ways it almost segues or reminds me of the idea of the theoretical statistician versus the applied statistician, the fact that you actually needed both in order to be able to, one that could still do the mathematics behind everything while one that actually started figuring out, “Well, how is it actually being applied?”, and how to actually connect with everybody that’s around.

Ameet Talwalkar (25:23):
Yeah, and it’s not clear to me that this is going to need to happen forever. The analogy that I make is that we can abstract problems in machine learning like clustering or binary classification or regression because we know that there are enough practical applications that roughly fit those APIs such that you then can have practitioners and researchers maybe be a little bit more removed from each other. Hyperparameter tuning’s another example of that. But I don’t think we’ve established those APIs yet in interpretability. I don’t think that there’s an infinite number of problems and every interpretability problem is a bespoke problem. But I think we need to figure out what are the K different canonical interpretability problems and how do I formalize those mathematically? How do I evaluate them the right way? What are the canonical examples of real applications for each of those to make this more of a formal and rigorous field. And I think that will happen, it kind of needs to happen but we’re not really there yet.

Denny Lee (26:23):
No, that’s fair. And I think going back to your definition like there’s no one clear answer for interpretability. So then I’m wondering if it’s related to privacy and fairness, that’s why there’s this need for massive transparency, so you can actually figure out what the heck’s going on, just your perspective, yeah.

Ameet Talwalkar (26:46):
I guess we haven’t bridged those ideas yet. If what we care about is fairness and privacy, maybe where does interpretability even come in? And I think that the argument is that if we know these models are going to be flawed in some way, we need to be able to poke around and inspect them and evaluate them and check that they’re doing reasonable things. And that I think is where interpretability comes in. Can I use interpretability to evaluate whether something is biased and maybe then fix that bias? Or similarly privacy or something like that. So I think if we knew a priori that our models were perfect, then maybe we wouldn’t care what they were doing and we wouldn’t need to inspect them, we wouldn’t need them to be transparent in any way. But obviously that’s not where we are now and probably will never be where we are. And so we need to understand what they’re doing and we need to be able to evaluate them.

Brooke Wenig (27:34):
So in terms of understanding models like neural networks, I know LIME, SHAP, Integrated Gradients are very popular, what are some of the pros of these approaches and what are some of their shortcomings and we need to still be looking forward to solutions in the research field?

Ameet Talwalkar (27:47):
Yeah, great question. I think that all of those methods are really interesting. They’re in some sense simple, but I think in a good way. In retrospect, yeah they’re the natural starting points for what you might want to do. I think the real disconnect, there’s nothing wrong with the methods, but they’re all solving different fundamental mathematical problems. And they’re all well specified problems, but in the same way that if fundamentally what I care about, if I make the analogy to different types of machine learning problems, if I want to solve a binary classification problem versus a clustering problem, I might use a different method. Right now, people don’t even realize that LIME, SHAP and Integrated Gradients are solving different problems. And so what you often see is that people just say, “Okay, my problem is an interpretability problem.” SHAP has really great open source software, so Scott Lundberg, the first author of that work, he and his colleagues did a lot of work to make those methods accessible to people. And so people use them, and they sometimes use them the right way but they also sometimes use them the wrong way. It’s not an issue with the method, every method has strengths and weaknesses or every method was intended for certain things but also not intended for other things.

Ameet Talwalkar (29:01):
And so I think having people understand when these methods are applicable and not applicable is kind of the issue with all of them, and it’s true with any new method that’s going to be proposed as well. It’s not really an issue with any one method as much as people understanding the taxonomy of these methods and knowing how to evaluate each of them for their specific problem to know whether it’s the right method or the wrong method.
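To make the point about methods solving different problems concrete, here is a minimal sketch in plain Python. It does not use the `lime` or `shap` libraries and is not how those packages are implemented. The toy model, the all-zeros baseline, and all function names are assumptions made for illustration. The local slope stands in for the kind of quantity a LIME-style local linear surrogate approximates in a very small neighborhood, and the Shapley computation is the brute-force textbook definition with "absent" features set to the baseline.

```python
from itertools import permutations

def model(x):
    # Hypothetical toy model with an interaction: f(x1, x2) = x1^2 * x2
    return x[0] ** 2 * x[1]

def local_slope(f, x, eps=1e-5):
    # Central finite differences: how the output changes for a tiny nudge of each input.
    slopes = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        slopes.append((f(up) - f(down)) / (2 * eps))
    return slopes

def integrated_gradients(f, x, baseline, steps=200):
    # Average the gradient along the straight path from baseline to x
    # (midpoint Riemann sum), then scale by (x - baseline).
    avg = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(local_slope(f, point)):
            avg[i] += g / steps
    return [(xi - b) * a for xi, b, a in zip(x, baseline, avg)]

def shapley_values(f, x, baseline):
    # Exact Shapley values: average each feature's marginal contribution
    # over every order in which features can be switched from baseline to x.
    phi = [0.0] * len(x)
    orders = list(permutations(range(len(x))))
    for order in orders:
        current = list(baseline)
        previous = f(current)
        for i in order:
            current[i] = x[i]
            value = f(current)
            phi[i] += (value - previous) / len(orders)
            previous = value
    return phi

x, baseline = [2.0, 3.0], [0.0, 0.0]
print("local slope:         ", local_slope(model, x))                     # approx [12, 4]
print("integrated gradients:", integrated_gradients(model, x, baseline))  # approx [8, 4]
print("shapley values:      ", shapley_values(model, x, baseline))        # [6, 6]
```

All three are reasonable answers, but to different questions: sensitivity to a small change at this point, credit accumulated along a path from a baseline, and average marginal contribution across feature orderings. Which one is appropriate depends on the use case, which is the argument being made above.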

Denny Lee (29:24):
Got it. So then, I want to bring it back to your paper. One quick callout: I noticed that in the paper itself you synthesize foundational work on IML methods for evaluation into an actionable taxonomy. So I’m going to ask the question from more of a naïve perspective because that’s where I’m coming from, which is, could you explain that? Like what is the context in terms of these new methods for evaluation and the taxonomy that goes with it?

Ameet Talwalkar (29:58):
Yeah, so there’s where the taxonomy is today and where we want a taxonomy to ultimately exist. And so in an ideal world, a taxonomy would exist where you can either start from the top or the bottom. So if you start from the top you’re a user, let’s say you care about model debugging. So as a user you might first say, “Why do I care about interpretability? Am I a developer of models and do I want to debug the model, or do I care about trust, or do I care about something else?” And if you say, “All right, I care about model debugging”, then there are questions about the details of the specific problem that you care about, in that not all model debugging problems are the same. There are maybe different types of model debugging problems. So one part of the taxonomy, and this is the part that we know less about, is going from broadly saying, “I care about interpretability”, to actually specifying a specific, well specified, technical mathematical problem that needs to be solved. And so that’s similar to going from saying, “I want to use machine learning”, to, “I want to solve a binary classification problem.” So that’s one part of a taxonomy, that’s the part that kind of doesn’t exist yet today.

Ameet Talwalkar (31:10):
The other end of the spectrum is if you’re a machine learning researcher, you start with the particular method that you developed. Your method is solving some technical objective which is in some way motivated by broadly interpretability. It’s not really a one-to-one thing, but methods solving different underlying objectives can be used to solve one or more well specified use cases. Say, a well specified model debugging use case, or so on. And so the dream would be that we work through enough of these end to end use cases together, researchers and practitioners, such that over time we can build on these individual use cases to flesh out this taxonomy such that in the future people can say, “Ah, I want to come up with a new method. I think this particular type of use case could use better methods.” And now I’m in it, but this use case is well defined in an abstract sense along with data sets and problems to do evaluation for that use case. So I as a researcher really don’t have to go end to end anymore, I can just use benchmarking data sets, benchmarking evaluation metrics, and just solve the technical interpretability sub-problem associated with that use case.

Ameet Talwalkar (32:28):
That’s where we want to go. Where we are right now is, some pieces of that taxonomy are better understood than others. And I would also say that my group and I, writing this paper, we’re much more coming from the, “I am a researcher”, point of view. So we understand the bottom part of that taxonomy better, and I think that a big contribution of this paper was at least putting down our take on how the taxonomy looks in terms of different types of methods and the different technical objectives that they’re solving. We know less about all the different practical use cases in interpretability, that’s what we’re trying to educate ourselves on right now by working with other people, but we also think that this kind of shared vocabulary for talking about this problem generally will hopefully allow not just us but other people to start fleshing out the clear gaps that currently exist in it.

Brooke Wenig (33:17):
So, speaking of the field of machine learning research, what advice do you have for people that want to get into this field? I know that you did a PhD, the more traditional route, but what advice do you have for other people that might currently be in industry to get more involved with machine learning research and interpretable machine learning?

Ameet Talwalkar (33:33):
Yeah, I think it’s a great time to be in this field whether you’re in academia or industry or whatever. ML is just increasingly pervasive, increasingly important practically, I think it’s only going to get more and more important over time. So it’s a great time to get in, it’s very exciting. I think that what I would recommend anyone to do is to learn a little bit of math. Right now I tell people that I get more theoretical every year as a researcher, and it’s not necessarily because my research is changing but because when I started grad school, machine learning was to a large extent a field of applied math. And my interests were more at the intersection of theory meets application or theory meets practice when the field was very, very theoretical. The field is shifting more and more to being applied, coming up with systems, using it in practice, which is great. But that doesn’t mean that the math isn’t important. So if I had one piece of advice, no matter what you’re doing, don’t be scared by math and learn the underlying math. That’s the underlying statistics, linear algebra and optimization maybe and calculus that you need. And more and more, there are amazing tutorials online for all of this stuff, whether it’s in the form of MOOCs or whatever else, blogs. So this material’s all available somewhere for people to read, and I would highly recommend people learn math. So, that’s one.

Ameet Talwalkar (34:57):
Two, I would say don’t be distracted by the hype. So machine learning is increasingly important. By no means do I think that everything that’s going on now is just a bubble or hype. That said, there is a lot of hype in this general field and I would say that people who are interested in working in this area should be working on interesting problems that are potentially solving important high impact areas, not just what’s the flavor of the day today because it’s hyped up by whoever is hyping it up. So maybe those would be my two big pieces of advice. Well, three pieces of advice: do it because machine learning is great. While you’re doing it, learn math. And three, don’t be distracted by the hype but instead pick interesting problems even if they don’t seem to be the coolest problems today. Because what’s en vogue today might not be en vogue tomorrow and things are pretty cyclical.

Brooke Wenig (35:51):
That’s some super helpful advice. I still remember in grad school having to go back and relearn all of the linear algebra, stats and calculus that I learned in undergrad because I was like, “When do I ever care about taking the second derivative of something?” It’s like, “Oh yeah, if it’s a convex function I’m going to want to do that all the time.” I definitely like that theme of learn math and don’t be distracted by the hype. As someone who works in industry, there are so many advancements coming out both from industry and academia. I’m like, “All right, what’s the latest flavor of the day”, as you’d said, “of object detection or image classification?” But I think the overall theme of, “Solve an interesting problem with high impact”, is the best advice that you can give.
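For readers wondering why the second derivative comes up with convex functions, the standard calculus fact is roughly the following. The example function is chosen here purely for illustration and does not come from the conversation.

```latex
% Second-order condition for convexity (f twice differentiable on an interval I):
f \text{ is convex on } I \iff f''(x) \ge 0 \quad \text{for all } x \in I.

% Multivariate analogue (f twice differentiable on a convex open domain):
f \text{ is convex} \iff \nabla^2 f(x) \succeq 0 \quad \text{for all } x.

% Illustrative example: a squared-error style loss in one parameter w
f(w) = (w - 3)^2, \qquad f'(w) = 2(w - 3), \qquad f''(w) = 2 > 0,
% so f is convex and its only stationary point, w = 3, is the global minimum.
```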

Ameet Talwalkar (36:29):
Right, and it’s not at all to say that the latest research coming out isn’t interesting or couldn’t potentially be high impact, but it’s almost always the case that when you want to solve a problem, you should start with simpler methods. And only when those simpler methods don’t work do you try to use more and more fancy things.

Brooke Wenig (36:46):
Definitely. Well, I want to thank you again for joining us today on Data Brew and sharing all of your expertise about interpretable machine learning and advice for getting people into the field of machine learning research.

Ameet Talwalkar (36:55):
Great, yeah. It was really great to get to chat with you guys, and thanks for having me.