Cristine Dewar is a data scientist on the Data Science fraud team at Affirm. She is currently working on models to prevent fraud. Cristine is passionate about fair and explainable ML and using data science to improve lives.
The Shapley algorithm is a model-interpretation algorithm that is well recognized in both industry and academia. However, because its runtime is exponential in the number of features and existing implementations take a very long time to generate feature contributions for even a single instance, it has seen limited practical use in industry. To explain model predictions at scale, we implemented the Shapley IME algorithm in Spark. To our knowledge, this is the first Spark implementation of the Shapley algorithm that scales to large datasets and works with most ML model objects.
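To make the idea concrete, here is a minimal single-machine sketch of a sampling-based Shapley estimate. It assumes "Shapley IME" refers to the permutation-sampling approximation of Shapley values. It is not the Spark implementation described above, and the names `shapley_sampling`, `predict`, `instance`, and `background` are hypothetical, chosen only for illustration.

```python
import random


def shapley_sampling(predict, instance, background, feature,
                     n_samples=1000, seed=0):
    """Monte Carlo estimate of one feature's Shapley contribution.

    predict    -- callable taking a list of feature values, returning a number
    instance   -- the row being explained (list of feature values)
    background -- list of reference rows to draw replacement values from
    feature    -- index of the feature whose contribution is estimated
    """
    rng = random.Random(seed)
    n_features = len(instance)
    total = 0.0
    for _ in range(n_samples):
        # Draw a random feature ordering and a random reference row.
        order = list(range(n_features))
        rng.shuffle(order)
        reference = rng.choice(background)

        # Features that come before `feature` in the ordering keep the
        # instance's values; all later features take the reference's values.
        preceding = set(order[:order.index(feature)])
        with_feature = [
            instance[k] if (k in preceding or k == feature) else reference[k]
            for k in range(n_features)
        ]
        # Same row, except the feature of interest is also taken from
        # the reference row.
        without_feature = list(with_feature)
        without_feature[feature] = reference[feature]

        # Marginal effect of switching this one feature to the instance value.
        total += predict(with_feature) - predict(without_feature)
    return total / n_samples


if __name__ == "__main__":
    # Toy linear model (an assumption for illustration, not a fraud model).
    def linear_model(row):
        return 2.0 * row[0] + 3.0 * row[1] - 1.0 * row[2]

    row_to_explain = [1.0, 2.0, 3.0]
    reference_rows = [[0.0, 0.0, 0.0]]
    for j in range(3):
        print(j, shapley_sampling(linear_model, row_to_explain,
                                  reference_rows, j, n_samples=200))
```

Each sample costs two model evaluations, so the total cost grows with the number of samples rather than with the number of feature subsets, and the samples are independent of one another. That independence is what makes this kind of estimator a natural candidate for distributing across a cluster, although how the work is actually partitioned in Spark is not specified here.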