
EPISODE 2

Data Ethics

Have you ever wondered how your purchasing behavior may reveal protected attributes? Or how data scientists and businesses play a role in combating bias? We discuss with Diana Pfeil recommendations for reducing bias and improving fairness, from SHAP to adversarial debiasing.

Diana Pfeil
Diana Pfeil is a leader in machine learning and optimization with over 15 years of experience building data products. Currently, Diana serves as the head of research and development at Pex, the global trusted leader in digital rights technology. Prior to Pex, she was a staff data scientist at Honey (acquired by PayPal), where she focused on e-commerce, lending, and responsible AI efforts. She was also CTO at travel tech startup Lumo, an engineer at Amazon working on early recommendation and personalization systems, and an adjunct professor teaching predictive modeling at CU Denver. She holds a PhD in Operations Research from MIT. She has a passion for the data community and has been heavily involved in the machine learning community as an event organizer and speaker.

Video Transcript

The Beans, Pre-Brewing

Denny Lee (00:05):
Welcome to Data Brew by Databricks with Denny and Brooke. The series allows us to explore various topics in the data and AI community. Whether we’re talking about data engineering or data science, we interview subject matter experts to dive deeper into these topics. And while we’re at it, we’ll be enjoying our Morning Brew. My name is Denny Lee. I’m a developer advocate at Databricks, and one of the co-hosts of Data Brew.

Brooke Wenig (00:32):
Hello, everyone. My name is Brooke Wenig. I’m the other co-host of Data Brew and machine learning practice lead at Databricks. And today I have the pleasure of introducing Diana Pfeil, who is a staff data scientist at PayPal, and also co-organizer of the Women in Machine Learning & Data Science Chapter in Boulder, Colorado. Diana, welcome.

Diana Pfeil (00:49):
Hello, thanks for having me.

Brooke Wenig (00:51):
So, Diana, can we start off with a little bit about how you got into the field of machine learning?

Diana Pfeil (00:56):
Sure. I feel like my path was pretty traditional in that I studied math and computer science in college. Before my first job, I was planning to get a PhD, but I wanted to see if I would enjoy doing software work first, because that’s what all my peers were doing, even though that’s not what I was interested in in college. I ended up, luckily, at Amazon in the personalization group, and at that time we were doing a lot of collaborative filtering and really machine learning. This was in 2005, before machine learning was very popular; I hadn’t even heard about it that much before, even though I was obsessed with math. And I just loved it. We had so much data, and it was so fun to come up with those algorithms. I knew that I would be doing machine learning for the rest of my career.

Brooke Wenig (01:45):
That’s awesome that you found your passion so early on. I also know that you’re very passionate about the field of data ethics. Could you talk a little bit about how you got into that field?

Diana Pfeil (01:53):
I am. Data ethics, I think, is something I didn’t really start thinking about until a little bit later. I did end up going to graduate school, and probably there and afterwards I was thinking a lot more about what the impact of my work is and what kinds of problems I’m really interested in. And I think on a personal level, it was also related a little to my background. I was born in Slovakia, which was a communist country, and my family escaped and we were refugees for a while. We lived in Austria and then eventually Canada. So I’ve always been really compelled by this question: how come some people, just by circumstance, like where they’re born or something about them that they have no control over, end up with completely different experiences in life or really different outcomes?

Diana Pfeil (02:44):
So, I think that’s always been compelling. As more people have talked about fairness and data ethics, I’ve always stayed on top of it. And now, with Honey and the acquisition by PayPal, I also work in financial services. That’s where there’s regulation, where you do have to look at fair lending and consider fairness when doing machine learning. And that’s been really fun, because it’s actually a regulatory requirement that we make sure our algorithms are fair. So it’s been really cool to be at the forefront of that professionally.

Brooke Wenig (03:16):
Could you talk a little bit more about fair lending? What are some things that you’re allowed to include or exclude? Is a ZIP code, for example, a feature that you’re allowed to include? And what are some things that regulation prevents you from providing as input to these models?

Diana Pfeil (03:29):
So you absolutely cannot provide as input to models anything that’s a protected attribute. Someone’s gender, you can’t input, of course, or race, anything like that. I think where it’s tricky is that pretty much most inputs about someone are correlated with some of these protected attributes. If I shop at Sephora and an algorithm knows that, it might be more likely to know I’m a woman, or maybe the type of browser I use could be correlated with some protected attributes about me. So that’s where you have to do some further analysis to ensure that the attributes you’re using aren’t just there as proxies for those protected attributes, but are actually helping the decision in some justifiable way.
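As a rough illustration of what such a proxy check could look like (this is a minimal sketch, not Diana’s method): train a small model to predict a protected attribute from a single candidate feature and see how well it does. The dataset and column names here (“applicants.csv”, “browser”, “gender”) are hypothetical.

```python
# Minimal sketch of a proxy check: how well does one candidate feature
# predict a protected attribute it should not reveal?
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("applicants.csv")           # hypothetical dataset
X = pd.get_dummies(df[["browser"]])          # candidate model input
y = (df["gender"] == "female").astype(int)   # protected attribute, never fed to the credit model itself

# An AUC well above 0.5 suggests the feature leaks the protected attribute
# and needs a real business justification (or removal).
auc = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                      cv=5, scoring="roc_auc").mean()
print(f"proxy AUC for 'browser' vs. gender: {auc:.2f}")
```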

Denny Lee (04:24):
So this is really interesting. Since a lot of these attributes can almost identify the very things that you don’t want to identify, what are the attributes that you could then use to ensure that this is done fairly?

Diana Pfeil (04:46):
Well, I think this is what’s tricky, because as we all know, credit scores are historically what’s used to determine credit in the United States, and things like income are a factor in credit. We all know that income encodes a lot of what is happening in society: there are big differences in income across racial groups, across genders, across lots of different protected groups. But income is known to matter; if you have a higher income, then you are a safer credit risk. So you have to be able to show that you can use income because there’s a business justification, because it actually correlates with the outcome. And then, as much as possible, you want to be using things that are actually telling you something of importance for the domain. And can I mention something else that’s interesting?

Denny Lee (05:41):
Oh, absolutely. Continue on. Yeah, absolutely. We’re not going to stop you.

Diana Pfeil (05:45):
Have you all heard of adversarial debiasing?

Denny Lee (05:48):
Oh no, no, please. For the audience, absolutely explain it. No, you’re better than me.

Diana Pfeil (05:54):
So as we all know, with machine learning models you usually have some objective that you’re optimizing towards, right? And it’s some metric that’s usually representative of accuracy in some sense. But what if you want to represent something like fairness as well as accuracy? You now have two objectives, so you’re in this space where you have to trade off between them. But there’s this really cool approach called adversarial debiasing, where what you’re doing is… it’s kind of like a GAN, where you have an adversary, and your adversary model is trying to say, “Oh, this model, this prediction, or this data input is male or female,” and my algorithm is trying to confuse that adversary. So what you end up doing is maximizing your metric, like accuracy or whatever you care about, while also maximizing the confusion of that adversary.

Diana Pfeil (06:51):
So you’re able to kind of hide any protected attributes and search, across the space of all of your best decisions, for the ones that are most fair in the sense that they don’t reveal the protected group. And I guess what I want to say is there are methods like that which aren’t standard yet, but I think over time, as the industry cares more and looks more at creating fair or unbiased algorithms, techniques like that will just be used, and it’ll be a better outcome for everyone. We won’t necessarily be compromising on accuracy, but we’ll be choosing a more fair solution.
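To make the idea concrete, here is a minimal sketch of adversarial debiasing for a tabular binary classifier, in the spirit of what Diana describes (and of Zhang et al.’s 2018 formulation), not the exact method used at any particular company. The network sizes, the `alpha` weight, and the tensors passed to `train_step` are illustrative assumptions.

```python
# Sketch: a predictor learns the task while an adversary tries to recover the
# protected attribute from the predictor's score; the predictor is rewarded
# for confusing the adversary.
import torch
import torch.nn as nn

n_features = 20  # illustrative
predictor = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 1))
adversary = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

opt_pred = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
alpha = 1.0  # how strongly to penalize the adversary's success

def train_step(X, y, z):
    """X: features, y: task label (e.g. loan repaid), z: protected attribute, all float tensors."""
    # 1) Train the adversary to recover z from the predictor's score.
    opt_adv.zero_grad()
    adv_loss = bce(adversary(predictor(X).detach()), z)
    adv_loss.backward()
    opt_adv.step()

    # 2) Train the predictor to be accurate on y while confusing the adversary.
    opt_pred.zero_grad()
    scores = predictor(X)
    pred_loss = bce(scores, y) - alpha * bce(adversary(scores), z)
    pred_loss.backward()
    opt_pred.step()
    return pred_loss.item(), adv_loss.item()
```

Open-source fairness toolkits such as IBM’s AIF360 also ship an implementation of adversarial debiasing, so in practice you would likely reach for one of those rather than rolling your own.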

Brooke Wenig (07:27):
That was actually going to be my follow-up question, because I know GANs are super popular, but not many people are actually using them to solve business problems. They’re super fun for artwork and for generating new images, but in terms of adversarial debiasing, how long do you think it’s going to take for the field to adopt this as a standard best practice for data scientists?

Diana Pfeil (07:45):
I do know that in the lending space, there are businesses that are already looking at using this. In financial services and other regulated industries, where there’s regulation saying, “Your algorithm must be fair,” those are going to be the first adopters, so I think it’s already starting to happen. As to when it’s going to be a standard, well, first I think we need standards around governance and responsible AI in companies, and maybe that’ll be the next step after that. So it’s after it’s become standard to have a responsible AI framework that everyone follows.

Brooke Wenig (08:24):
Got it. Kind of like the GDPR for AI rather than just for the data itself.

Diana Pfeil (08:28):
Yeah. And that, by the way, is starting to happen. Europe is probably going to do more regulation of algorithms and outcomes soon, and California, I’m sure, will follow. So I think the space is really shifting, and it’ll become part of the mainstream to think about these issues as a data scientist designing your ML models.

Denny Lee (08:49):
I’m just curious, and this is going a little off-script, but you’ve reminded me of something. Do you feel that approaches like differential privacy still have to come into play, or do you think these adversarial debiasing approaches are actually better? It’s not really about choosing one or the other, so I apologize for the way I asked that question, but I’m curious about your take, because the confusion you described in adversarial debiasing reminds me of differential privacy, where as you ask more questions, you add more noise to confuse the person querying the data.

Diana Pfeil (09:28):
I feel like it’s definitely an “and” question. I think privacy is a concern when it comes to data ethics too. It’s actually the concern that has been most adopted by regulators and people, which is, “Hey, why are these people using my data, that I should have control over, to make decisions about me? What’s up with that?” I think that is part of data ethics, and there are approaches you can use to protect people when it comes to privacy. And really, these questions around fairness are just another dimension that’s important to think about. I really don’t think it’s either-or.
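For listeners who haven’t seen it, here is a minimal sketch of the mechanism Denny alludes to: a differentially private count that answers a query with noise calibrated to a privacy budget. The toy data and the epsilon value are illustrative assumptions; in a real system each additional query spends more of the overall budget.

```python
# Sketch of the Laplace mechanism for a single counting query.
import numpy as np

def private_count(values, predicate, epsilon=0.5):
    """Epsilon-differentially private count of items matching `predicate`.

    A counting query has sensitivity 1 (adding or removing one person changes
    the true count by at most 1), so Laplace noise with scale 1/epsilon suffices.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

incomes = [42_000, 91_000, 37_500, 120_000, 58_000]   # toy data
print(private_count(incomes, lambda x: x > 50_000, epsilon=0.5))
```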

Denny Lee (10:07):
Yeah, that makes a ton of sense. And this naturally segues into my next question. You’ve already been discussing a little bit how data scientists and businesses combat bias, because you’re doing it yourself. But I’m just curious: why do you think there’s such a focus on the tech industry and AI, when bias really does exist in all these other fields too? It’s not just “us.”

Diana Pfeil (10:26):
Well, I guess my question back at you is: do you think that other industries are completely off the hook? Or what are the examples you have in mind of other industries where there’s no focus on things like bias or fairness?

Denny Lee (10:41):
Well, so for example, if you look at almost any business, whether or not it depends on data or AI, it’s exactly the same point, right? There are going to be biases in terms of who’s purchasing what, right? Yes, eventually there’s going to be data that goes into it. But often we focus just on the data aspect, as opposed to asking, “Hey retailers, how are you hiring people? How are you promoting people?” Right? And for that matter, not just retailers, but businesses in general. It’s very clear that there are such things as glass ceilings; it’s very clear that there are ceilings for people of color. Right?

Denny Lee (11:21):
So, I’m just curious from that standpoint, like why specifically the data and AI space? Or is it just because we’re in everything anyways?

Diana Pfeil (11:28):
I see your point. I actually think that kind of scrutiny, and those questions about other industries and other decisions, are part of the reason people care about this around AI. No one was asking about the impact of our models 15 years ago when I was working on machine learning, but people were talking about the glass ceiling. So I think it is a little bit related. I think there’s also scrutiny now on brands over whether they’re inclusive in their advertising. These questions around bias are being asked across industries just as much as for machine learning and AI. I guess I will add, though, that maybe one reason why it’s definitely critical to think about this stuff for machine learning and AI is that these kinds of algorithms have so much scale. They can impact everyone in the world, or huge groups of people. Whereas with other decisions, if it’s just hiring practices at one company, the scale and the impact just aren’t as huge.

Brooke Wenig (12:32):
That definitely makes sense: with data and AI, you can automate a lot of these decisions, so the scale is definitely a factor, versus an individual decision made by one hiring manager at one company, which doesn’t have that type of impact. I do feel for a lot of people in other industries outside of tech, though, because in terms of fairness, it’s not just about whether or not I get approved for a loan. It’s things like the pay gap, or things like healthcare: Black women in the US are three times more likely than white women to die from pregnancy, even within the same socioeconomic group. So there are systemic problems in other fields. What advice do you have, coming from the tech field and having these different tools that you use to combat bias, that you would recommend for other industries to leverage?

Diana Pfeil (13:18):
Oh, wow. That’s a really tough question. I think awareness is really huge. What I think… okay, let me step back a little bit. At least when it comes to machine learning, I think part of the evolution is the following: when we started with machine learning, the idea was that we could actually bring fairness into the world. Instead of, to your point, Brooke, a single person sitting in an office, smoking a cigar and making a credit decision about me with all of their built-in biases, we can now do this in a more fair way, where we take attributes that everyone can hopefully see and understand about themselves, and maybe improve on, and it’s not just one single person with their biases making a decision. So, in theory, it should protect the world a little bit from that situation.

Diana Pfeil (14:12):
But of course, what can happen is that you actually amplify societal biases at a larger scale, and that’s the risk. So asking those questions, like how is this algorithm or approach amplifying the systemic biases that already exist in society, is something we should do when we’re designing algorithms. But really, everyone can ask that when they’re creating a product or an experience, whether that’s the experience of giving birth at a hospital or something else.

Brooke Wenig (14:48):
You bring up a really interesting point, that people should know what attributes are being used. Do you think these regulatory bodies are ever going to release the models that they use, like, “Oh, it’s 0.2 times your credit score, plus 0.3 times that”? Do you think they’re ever going to release the models so people can better understand why they didn’t make the cut?

Diana Pfeil (15:07):
I think they absolutely will, because that’s part of transparency, which is one of the pretty basic tenets of AI ethics, besides fairness. People should know why the decision is happening, so explainability and transparency are part of that. In credit, that already happens: if you get denied for a credit decision, you have to be told why. It might not be in the form of “2.5 times your income plus whatever,” but it should be something like the top three factors that weighed into a negative decision. So you have to be able to explain your decisions, at least for credit. And I think that’s coming for other impactful decisions, even in less regulated industries. Or at least there’ll be a lot of pressure on companies: if a decision is being made about me, I should be able to find out why, just like I’m now able to email a company and ask what data they have about me if I’m European.

Denny Lee (16:09):
By the way, I’m in complete agreement with you, but I’m going to pivot a little bit and say: if you go ahead and release the models, a lot of companies are going to say, “Now you can game the system. You can beat the system.” So wouldn’t releasing the model actually cause more problems, because now you’re encouraging folks to basically game or beat the system?

Diana Pfeil (16:34):
There should be a positive feedback loop in models. That’s actually another tenet of a good system: if there’s a decision made about me, it shouldn’t be static. It should be able to course-correct. If the decision was wrong about me, next time I should know that the company is doing a good job at decreasing, say, their false positive rates or whatever. By the way, I’m talking like this is what I feel should happen, but I think we’re far away from it, although we’re making steps as a society. This notion of not having negative feedback loops is something that you want in a machine learning system. I guess, to your question, are we giving away all our trade secrets? Is that really the question?

Diana Pfeil (17:29):
I think there are two types of decisions. There are algorithms that impact my life in a fundamental way, my basic needs, like my access to housing, credit, jobs, things like that. And then there are decisions that are a little bit more in a gray area, and I think those will take longer to really become transparent or have explainability around them. Like, “Why did I get served this ad?” I’m not sure how long it’ll take for me to find out. But I do think this comes back to privacy: I would love to be able to control the decisions that are made about me by AI systems.

Denny Lee (18:14):
Oh, absolutely. It goes into this idea that right now, because everything’s given away for free, we are the data, we are the product, per se. And exactly to your point about data ethics and privacy in general, we’d like to not be the product. So I’d love to dive in a little bit more. You’ve explained some of those tenets; are there other tenets of data ethics that you’d like to bring up that we haven’t really covered yet?

Diana Pfeil (18:43):
So, I just want to say I’m taking these a little bit from Cathy O’Neil’s book, Weapons of Math Destruction, which you might’ve read, and which is probably one of the earliest books to ask this question of what the impact of our algorithms is. In there she had four principles, and we’ve covered them. One is scale: how many people do you impact? Two is fairness, which we talked about. Then there’s transparency, and the feedback loop. So yeah, those are the four, and we’ve now covered all of them. I think maybe there are ones I’m forgetting.

Brooke Wenig (19:20):
So a second ago, you were talking about explainability, and I’m curious what your thoughts are on using tools like SHAP and LIME, which give you these locally accurate explanations, but they’re not globally accurate. You can’t just have a single SHAP model represent your entire dataset the way your underlying neural network or your underlying tree-based model does. I’m just curious, from the regulatory perspective, are you okay with having something that’s locally accurate but isn’t able to explain globally what is happening with your data?

Diana Pfeil (19:54):
Okay. Yeah, I see what you’re saying. I think SHAP is very powerful because you do have some information globally. You can look at all of your training data and your SHAP graph and kind of see, stepping out, what’s happening across your entire dataset. So I’m not sure I feel like that’s a problem. SHAP is so powerful. Before SHAP, you could make up things related to feature importance, but they were problematic and weren’t locally accurate. So I do think it’s more powerful to have locally accurate information, or something like SHAP. But maybe there’ll be more solutions around explainability. I think explainability is pretty young, and there are going to be more libraries and more solutions that will help us get into what our algorithms are doing. And that’s a good thing.
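As a concrete illustration of the local-versus-global point, here is a minimal sketch using the shap library on a synthetic dataset: each row of SHAP values explains a single prediction, and aggregating them over the whole dataset gives the global view Diana mentions. The model, data, and plot choices are illustrative assumptions, not the setup used at any company discussed here.

```python
# Local vs. global explanations with SHAP on a synthetic classifier.
import shap
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)  # synthetic stand-in
model = xgb.XGBClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # one row of feature contributions per prediction

# Local: how the features pushed one particular prediction up or down.
shap.force_plot(explainer.expected_value, shap_values[0], X[0], matplotlib=True)

# Global: aggregate the per-prediction contributions across the whole dataset.
shap.summary_plot(shap_values, X)
```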

Brooke Wenig (20:52):
Do you think we’re ever going to get to the point where, if I ask, “Why was I denied for this loan?”, they’re just going to send me back the SHAP values? Do you think we’re ever going to get that very nice visualization, or do you think we have to provide some more education, explaining to people what exactly an algorithm is? Like, “Hey, this is a linear regression model,” or, “Hey, this is an ensemble of trees that we used.” What do you think needs to be provided back to end users so they better understand how the algorithm made its decision?

Diana Pfeil (21:18):
I don’t know. Sending the general population the SHAP values feels a little bit like a can of worms, but personally… how cool would that be? Send me the SHAP values for every single decision. I really think that just being able to understand the top five contributors to a decision is already very powerful. Knowing the exact SHAP values doesn’t feel that necessary. And like you said, it’s locally accurate, and it depends on the rest of the training set too. So I think going into those weeds might not be helpful for the general population.
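Continuing the sketch above, this is roughly what turning one applicant’s SHAP values into the “top contributors” style of explanation could look like. The feature names are hypothetical, and whether a positive contribution means “toward approval” depends on how the model’s target is encoded.

```python
# Turn one row of SHAP values into a short list of top contributing features.
import numpy as np

feature_names = ["income", "credit_history_len", "utilization",
                 "recent_inquiries", "loan_amount", "tenure"]   # hypothetical

row = shap_values[0]                       # contributions for one decision (from the sketch above)
top = np.argsort(np.abs(row))[::-1][:5]    # five largest contributions by magnitude
for i in top:
    direction = "pushed the score up" if row[i] > 0 else "pushed the score down"
    print(f"{feature_names[i]}: {direction} ({row[i]:+.3f})")
```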

Denny Lee (21:58):
Well, that’s completely true. But you have to admit, getting a bunch of SHAP graphs actually looks pretty nice.

Diana Pfeil (22:03):
Yeah. They’re very good at visualization, those folks.

Denny Lee (22:06):
Absolutely. Absolutely. But let’s switch gears a little bit again, as we’re coming to the ninth inning here. What are some of the most pressing issues, from your perspective, caused by a lack of data ethics?

Diana Pfeil (22:21):
I think there are two areas. One is what I talked about a little bit before, which is that we’re making decisions around people’s fundamental needs, and if those are unfair or encode society’s biases, then we’re just going to end up in a more unequal world, which isn’t good for anyone. It creates a lot of conflict, it limits people’s opportunity, it increases economic disparity, and all of that. So that’s not great; that’s terrible. The other is that there are so many ways we have machine learning systems making decisions where we just have no idea what the impacts and the harm are. One that I have no idea about is the fact that Google search uses BERT, which we know has a lot of gender biases and racial biases; really, all of society’s, and Reddit’s, biases around people are in BERT.

Diana Pfeil (23:21):
And now, every time I do a search, that is somehow potentially encoded in the results. I don’t like that, but I’m also not sure exactly what the harm there is. So I feel like there’s a whole host of little things where we’re not even sure what the impact is, and that’s something I worry about.

Brooke Wenig (23:42):
So I know you mentioned Cathy O’Neil’s book. I’m just curious, who are some other folks in the field of data ethics that you follow?

Diana Pfeil (23:48):
Okay. Well, I’ve been off Twitter since the pandemic because I’m just trying to stay sane, but I think Twitter was a place where I really appreciated following people and staying on top of things. One person I follow is Rumman Chowdhury, though I’m not sure how to pronounce her name. She just got hired as a director of AI ethics at Twitter, and previously she ran Parity (parity.ai), which is an algorithmic audit company. She really stays on top of what’s going on in this space, and I enjoy following her. And then there’s the Algorithmic Justice League and Joy Buolamwini (I’m so bad with names), and I really follow what she’s doing in that space too.

Brooke Wenig (24:38):
Awesome. Well, thank you for those recommendations; Denny and I will definitely have to check those out. Some folks that I personally follow: Timnit Gebru. I know she was pretty popular this past year, but she also presented at our Spark + AI Summit two years ago on the field of data ethics. I also really enjoy Rachel Thomas; she had an excellent data ethics course that she taught through USF. Denny, who are some of the folks that you like to follow?

Denny Lee (25:00):
Honestly, I’m still learning right now. The fact that I’m talking to you, Diana, and listening to this vidcast, podcast, is pretty much the beginning for me. I’ve been pretty much stuck in the privacy space, and I’m only now learning just how brutal the ethics questions are, hence the whole reason it was super interesting to talk to you, Diana.

Brooke Wenig (25:24):
All right, I’m going to go ahead and close us out for the day, because I know Diana has to get to a very important 9:00 AM. I just want to say thank you so much for joining us, sharing all of your viewpoints on data ethics, and educating us on the various tenets. And thank you again for taking time out of your day to join us.

Diana Pfeil (25:38):
Thanks. It was great being here.