Welcome to the Databricks Solution Accelerator demos – your fast preview of how to apply these pre-built notebooks based on best practices to solve common business problems. Today we will be looking at the Solution Accelerator for customer lifetime value.
Customer lifetime value, just like all customer analytics, is a key area of focus for the lakehouse. Most organizations are coming to the realization that not every customer is equally profitable, and it is important for them to understand who their good customers are and who they are not, so they can adjust investments in the good ones and bring everybody to net profitability. The standard tool for this is an estimation of customer lifetime value. Simply put, customer lifetime value is an estimate of the value that we expect to obtain from a customer, whether that’s in terms of spend or margin, for each of some number of future periods, typically three to five years. Not knowing whether the customer will stay engaged for that time period, we have to incorporate a retention estimate into our considerations. And of course, we have to apply discounts for future revenues. Summing this all up, we arrive at a potential sum that we expect to obtain from a customer for the period of interest.
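To make that summation concrete, here is a minimal sketch of the calculation described above: per-period value, weighted by the chance the customer is still engaged, then discounted. The function name, the flat retention rate and the example figures are all illustrative assumptions and are not taken from the accelerator notebooks. Whether retention and discounting apply from the first period or only from the second is also a convention choice; this sketch applies both from period one.

```python
def customer_lifetime_value(value_per_period, retention_rate, discount_rate):
    """Sum expected per-period value, weighted by survival and discounted.

    Assumes a constant retention rate and discount rate per period
    (a simplification chosen for illustration).
    """
    clv = 0.0
    for period, value in enumerate(value_per_period, start=1):
        survival = retention_rate ** period           # chance the customer is still engaged
        discount = (1.0 + discount_rate) ** period    # present-value divisor
        clv += value * survival / discount
    return clv


# Hypothetical customer: 100 per year for three years,
# 80% yearly retention, 10% yearly discount rate.
clv_example = customer_lifetime_value([100.0, 100.0, 100.0], 0.8, 0.10)
print(round(clv_example, 2))
```

With perfect retention and no discounting, the result collapses to the plain sum of the per-period values, which is a quick sanity check on the formula.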
Now, the challenge we have in most retail organizations is that we do not have contractual relationships with our customers. We instead have to take a look at the pattern of engagement that the customer establishes, and from there estimate retention and monetary value components to factor into a customer lifetime value, or CLV. The work for this was done by a series of researchers back in the late 1980s, was popularized in the 2000s, and is now packaged as a very popular open-source library known as Lifetimes. We’ll use that library and Databricks to help us estimate CLV in this Solution Accelerator demo.
Here we’re looking at part one of a two-part series where we tackle retention and the value components of CLV. Our blogs have our perspective on retention and value estimation and links to resources that are helpful as you explore what fits into your organization. We’re going to jump right into the code that’s available inside the attached notebooks.
These notebooks are built off of publicly available data sets and show us how we can go about estimating customer lifetime value. Here we’re looking at transactional history for individual customers. We have the customer’s unique identifier here, and we have the dates on which the customer was engaged. From this information, we can make a series of estimates (which we scroll down to here) of the frequency, recency and term of repeated engagement over the lifetime of the customers for this period. From that we can go through a process of building our models, again examining the patterns that individual customers establish with us relative to the overall patterns in the population. We spend quite a bit of time in here exploring the nuances of the models that are produced and how to understand whether they’re applicable to individual businesses.
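As a rough illustration of those three metrics, the sketch below derives them from a small, made-up transaction list using only the Python standard library. It assumes the definitions commonly used with this family of models: frequency counts repeat purchase days (the first day is excluded), recency is the time from the first to the last purchase, and the term (often written T) is the time from the first purchase to the end of the observation window. The function name and the data are hypothetical, and the notebooks may compute these differently.

```python
from datetime import date


def summarize_customers(transactions, observation_end):
    """Return {customer_id: (frequency, recency, T)}, all measured in days."""
    purchase_days = {}
    for customer_id, purchase_date in transactions:
        # A set collapses several purchases on the same day into one engagement.
        purchase_days.setdefault(customer_id, set()).add(purchase_date)

    summary = {}
    for customer_id, days in purchase_days.items():
        first, last = min(days), max(days)
        frequency = len(days) - 1                 # repeat purchase days only
        recency = (last - first).days             # age at the last purchase
        T = (observation_end - first).days        # age at the end of the window
        summary[customer_id] = (frequency, recency, T)
    return summary


# Made-up transaction history: (customer id, purchase date)
sample_transactions = [
    ("A", date(2023, 1, 1)),
    ("A", date(2023, 1, 31)),
    ("A", date(2023, 3, 2)),
    ("B", date(2023, 2, 1)),
]
customer_summary = summarize_customers(sample_transactions, date(2023, 4, 1))
for customer_id, metrics in sorted(customer_summary.items()):
    print(customer_id, metrics)
```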
But let’s come down to this one image at the bottom that really captures what this model is doing. Here we’re looking at retention. And what we can see is that with each engagement, as represented by these red lines, we’re gaining more and more confidence in our customers’ overall retention. Though as time proceeds, as we move between the engagements, doubt starts to grow as to whether these customers continue to be engaged. But again, with each interaction, our understanding of the customer shifts, and the speed with which our confidence degrades changes as well.
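The shape of that chart can be reproduced with a small sketch. The code below uses the closed-form "probability alive" expression from the BG/NBD model, one member of the model family that Lifetimes implements; the demo does not say which model produced the image, so treat this as an illustrative stand-in. The four parameter values are invented for the example and are not fitted to any data.

```python
def probability_alive(frequency, recency, T, r, alpha, a, b):
    """BG/NBD probability that a customer is still engaged.

    frequency: number of repeat purchases
    recency:   customer age at the last purchase
    T:         customer age now
    r, alpha, a, b: model parameters (hypothetical values are used below)
    """
    if frequency == 0:
        # The BG/NBD form treats customers with no repeat purchases as alive.
        return 1.0
    ratio = (a / (b + frequency - 1)) * ((alpha + T) / (alpha + recency)) ** (r + frequency)
    return 1.0 / (1.0 + ratio)


# Invented parameters, for illustration only.
params = dict(r=0.25, alpha=4.0, a=0.8, b=2.5)

# Confidence decays as time passes since the last purchase at age 30.
decay = [probability_alive(3, 30, T, **params) for T in (30, 60, 120)]

# Confidence right after a purchase is higher for customers with more history.
after_one = probability_alive(1, 10, 10, **params)
after_five = probability_alive(5, 10, 10, **params)
print(decay, after_one, after_five)
```

The two comparisons mirror the two behaviors in the chart: the estimate drifts down between engagements, and each new engagement resets it to a level that rises as the customer's history grows.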
Now, that’s the first part, our retention model. In the second part of our blog, we tackle the value part of this exercise. Again, we use the same data set. We’re going to come down here and spend some time focusing on the monetary component. We’re going to use sales amount, because we don’t have information about the cost of these goods, so we can’t look at margins. But some organizations might do that. Based on this, we’re going to add to our set of metrics a monetary value metric, which captures the amount that was spent in secondary engagements (so not the primary, but the follow-up engagements). From that, we can build a model very similar to the one we saw before. And when we bring together both our retention and our monetary value models, which I get to down here, we have the ability to estimate a CLV. Now here, I’m doing a 12-month CLV, but we can certainly look further out. In this table, we can see how much we expect to obtain from each individual customer over the next 12 months.
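Here is a minimal sketch of one way to compute such a monetary value metric, assuming it is defined as the average daily spend across a customer's follow-up purchase days, with the first purchase day excluded. The function name and the sample amounts are made up, and the notebook's exact definition may differ.

```python
from datetime import date


def repeat_monetary_value(transactions):
    """Return {customer_id: average spend per repeat purchase day}."""
    daily_spend = {}
    for customer_id, purchase_date, amount in transactions:
        per_day = daily_spend.setdefault(customer_id, {})
        # Several purchases on one day count as a single engagement.
        per_day[purchase_date] = per_day.get(purchase_date, 0.0) + amount

    result = {}
    for customer_id, per_day in daily_spend.items():
        repeat_days = sorted(per_day)[1:]         # drop the first purchase day
        if repeat_days:
            result[customer_id] = sum(per_day[d] for d in repeat_days) / len(repeat_days)
        else:
            result[customer_id] = 0.0             # no follow-up engagement yet
    return result


# Made-up sales history: (customer id, purchase date, sales amount)
sample_sales = [
    ("A", date(2023, 1, 1), 20.0),
    ("A", date(2023, 1, 31), 30.0),
    ("A", date(2023, 3, 2), 10.0),
    ("A", date(2023, 3, 2), 40.0),
    ("B", date(2023, 2, 1), 75.0),
]
monetary = repeat_monetary_value(sample_sales)
print(monetary)
```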
This is super useful, but a little difficult for most organizations to employ. So we can go an added step to show how you can take this model, convert it into a function so that you can simply write a “Select” statement and use your model as a function passing in the pre-computed values to then make your CLV estimations as part of a query.
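The idea of calling a model from a query can be sketched without a cluster. In the example below, Python's built-in sqlite3 module stands in for the lakehouse SQL engine, and a toy scoring function stands in for the fitted model; neither is what the accelerator actually uses. The table name, column names and figures are all hypothetical.

```python
import sqlite3


def estimate_clv(expected_purchases, expected_order_value):
    """Toy stand-in for a fitted model: expected purchases times expected spend."""
    return expected_purchases * expected_order_value


conn = sqlite3.connect(":memory:")

# Register the Python function so SQL statements can call it by name.
conn.create_function("estimate_clv", 2, estimate_clv)

# Hypothetical table of pre-computed per-customer inputs.
conn.execute(
    "CREATE TABLE customer_metrics ("
    "customer_id TEXT, expected_purchases REAL, expected_order_value REAL)"
)
conn.executemany(
    "INSERT INTO customer_metrics VALUES (?, ?, ?)",
    [("A", 4.0, 25.0), ("B", 0.5, 80.0)],
)

# The model is now just another function in a SELECT statement.
clv_rows = conn.execute(
    "SELECT customer_id, "
    "estimate_clv(expected_purchases, expected_order_value) AS clv "
    "FROM customer_metrics ORDER BY customer_id"
).fetchall()
print(clv_rows)
conn.close()
```

The design point is the same as in the demo: once the function is registered, analysts can score customers from plain SQL without touching the modeling code.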
For organizations, being able to produce this output and then take it over to customer relationship management systems is a great way to start adjusting engagements with customers to increase overall engagement and net profitability over the lifespan of each individual customer. It also works best on the lakehouse.
Ready to get started with customer lifetime value? Click on the link in the description below to go right to our full write-up for this solution and gain a deeper understanding of how the Databricks Lakehouse uniquely solves for the challenges associated with going from batch to streaming and BI to AI.
Or visit the Databricks Solution Accelerator hub to see all our available accelerators as well as keep up to date with new launches.