Get started and build a credit data platform for your business by visiting the demo at Databricks Demo Center.
According to the World Bank's reporting on financial inclusion, a staggering 1.7 billion adults were deemed underbanked. Many underbanked individuals find it difficult to secure loans from traditional financial institutions, leading them to turn to informal lenders who offer loans at exorbitant interest rates. This group typically includes younger generations, low-income individuals in developing nations, and rural residents, many of whom have gone mobile in order to gain financial access.
When it comes to the underbanked, mobile banking has typically stepped in to meet the consumer needs in areas where traditional banking is perceived to be weak. The number of smartphone users worldwide has consistently grown by a minimum of 5% annually over the past five years, presenting a new and promising opportunity for lending. Financial institutions need to leverage this opportunity by utilizing machine learning and other advanced analytics to assess a customer's creditworthiness and gradually build up a credit history through their platforms, expanding the scope of financial inclusion and opening doors to previously unattainable credit opportunities.
In the spirit of financial inclusion and expanding traditional thinking, this blog serves as a guide and reusable public Lakehouse demo for how banks, fintechs, and non-banks can enter the low-hanging fruit markets that are waiting and eager for better financial services.
As Deloitte points out in their report on financial inclusion, 'doing well and doing good are not mutually exclusive'; this is resonating with many data teams in the industry. Let's define some terms to understand this concept better.
Credit decisioning is the process of assessing an individual's creditworthiness to determine their ability to repay a loan or credit. It is an essential part of the lending industry and involves several stages, including data collection, data processing, data analysis, and loss estimation. Traditionally, credit decisioning has been a lengthy process, even for short-term loans, which are the type of loan most commonly taken out by the underbanked. Moreover, the process is heavily biased towards individuals with prior credit history or long-term loans. With the advent of buy-now-pay-later (BNPL) offerings, digital markets for home purchases, and non-banks offering credit, the world stage for credit decisioning has completely transformed.
As AI-assisted credit decisioning continues to advance, the banking and payment industries are witnessing a surge in customer demand for a Databricks Lakehouse design. This design offers a credit data platform that provides a holistic and efficient solution to the credit decisioning process. The platform can enable data integration, audit, AI-powered decisions, and explainability, providing a single source of truth for data analytics. It includes machine learning models that can analyze vast amounts of data and provide more accurate predictions about a borrower's creditworthiness, improving both the speed and the accuracy of credit decisioning. The credit data platform can help fintechs, banks, or non-banks looking to offer financial services make informed credit decisions, reduce the risk of default, and offer better rates and terms to their customers. Before delving into the technology solution, we will cover the areas in which financial institutions are struggling to serve markets today.
Part I - Why Change?
Implementing a credit data platform can be a significant challenge for banks and other financial institutions. Consider the following reasons.
Credit decisioning for underbanked customers can be challenging, as these individuals may not have a traditional credit history or financial records that can be used to assess their creditworthiness. Furthermore, credit decisioning data is often stored across different sources and in incompatible formats, making it difficult for data users to merge it and extract valuable insights. As a result, data is available only to data engineers and scientists, and not to end users such as marketing and finance teams, call center agents, and bank tellers.
Banks and other financial institutions face significant challenges when building a credit data platform. They must ensure that the platform is secure, compliant with regulatory requirements, and protects sensitive customer data. Achieving these goals requires addressing challenges related to security and governance, such as data privacy, access control, data quality, and compliance. However, data governance and enterprise security control can be difficult due to the complexity of data ecosystems, evolving threats, insider risks, and resource constraints. To effectively manage and secure their data, organizations must address these challenges at the foundation; governance cannot be an afterthought.
Explainability and fairness are essential in credit decisioning because they promote unbiased and understandable decisions that protect consumers from discrimination and ensure equitable outcomes. A lack of fairness and explainability can erode trust in the credit system and discourage consumers from applying for credit. However, evaluating fairness in credit decisions and explaining outcomes can be challenging due to several factors, including the complexity of credit scoring models, biases in the underlying data, and human bias.
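To make the fairness question concrete, here is a minimal sketch of one common check, the demographic parity gap: the difference in approval rates between customer groups. It is an illustration under stated assumptions, not the method used in the demo; the group labels, the sample decisions, and the function names are all hypothetical.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the approval rate per group.

    `decisions` is an iterable of (group, approved) pairs, where
    `approved` is True when credit was granted.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def demographic_parity_gap(decisions):
    """Largest difference in approval rate between any two groups."""
    rates = approval_rates(decisions)
    return max(rates.values()) - min(rates.values())


# Hypothetical decisions: (group label, approved?)
sample = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", True),
]

rates = approval_rates(sample)
gap = demographic_parity_gap(sample)
print(rates, gap)
```

A gap close to zero means the groups are approved at similar rates. It is only one lens on fairness, and a large gap is a prompt for investigation rather than proof of discrimination.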
In this blog, we demonstrate how setting the right data foundations through the Databricks Lakehouse can address the aforementioned challenges and enable companies to create better credit models and achieve their business goals, including serving their underbanked customers, assessing credit risk and exposure, introducing novel products such as buy-now-pay-later, and others.
Good credit models require a wide variety of data describing bank customers from as many angles as possible, including their spending habits, previous delinquencies, sources of income, and more. The left-hand side of the picture shows the different financial data sources we need to create a modern credit decisioning platform, including credit bureau data, customer information, real-time transactional data, and partner data (telecom data that we use to augment the traditional banking information). These sources differ widely in file format, velocity of ingestion, volume, and source platform.
To solve the variety challenge, we begin with Data Unification: the ability to ingest any source of data into a single source-of-truth location.
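As a rough illustration of what unification means at the record level, the sketch below merges a CSV bureau extract and a JSON-lines transaction feed into one record per customer. It uses only the Python standard library rather than any Lakehouse tooling, and the column names, sample rows, and `unify` helper are assumptions made for this example.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical extracts; in practice these would arrive as files or streams.
BUREAU_CSV = """customer_id,credit_score
c1,640
c2,710
"""

TRANSACTIONS_JSONL = """{"customer_id": "c1", "amount": 25.0}
{"customer_id": "c1", "amount": 40.0}
{"customer_id": "c3", "amount": 10.0}
"""


def unify(bureau_csv, transactions_jsonl):
    """Merge both sources into one record per customer, keyed by customer_id."""
    unified = defaultdict(lambda: {"credit_score": None, "total_spend": 0.0})

    # Batch source: one bureau row per customer.
    for row in csv.DictReader(io.StringIO(bureau_csv)):
        unified[row["customer_id"]]["credit_score"] = int(row["credit_score"])

    # Event source: many transactions per customer, aggregated into a total.
    for line in transactions_jsonl.splitlines():
        if line.strip():
            event = json.loads(line)
            unified[event["customer_id"]]["total_spend"] += event["amount"]

    return dict(unified)


customers = unify(BUREAU_CSV, TRANSACTIONS_JSONL)
print(customers)
```

Customer `c3` has transactions but no bureau record, which is the shape an underbanked customer typically takes in the data: activity exists, but the traditional credit history does not.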
Once the proper data foundation has been set, we can move to Data Decisioning and find the hidden patterns and correlations we call "data insights":
Today, data is often accessible and usable only by data teams, such as data scientists and data engineers. Data teams, however, are not the end users of a use case such as credit decisioning. The end users are the credit agents evaluating an application, the call center agents communicating with a customer, and the marketing teams preparing promotional materials for upselling underbanked customers. More often than not, these personas have access neither to the data nor to dashboards or machine learning predictions. Historically, data teams would export any requested data to CSV or PDF files and send it to business users over email. This approach is not secure, scalable, or simple.
Unity Catalog and Databricks' data warehousing solution, Databricks SQL, allow financial services organizations to "democratize" their data and insights, making them accessible not only to data users but to everyone in the organization. They do so through capabilities such as BI visualizations and Delta Sharing, an open protocol for secure live sharing of any data with no replication, centralized governance, and cross-platform recipients.
Data and user unification, actionable decisioning, and data democratization are the fundamentals of the Databricks Lakehouse for Financial Services. Together they deliver democratized data access without sacrificing security and governance, as we will show.
To start our tour of the Lakehouse demo for credit decisioning, we want to show the impact any financial institution can achieve. By unifying our data and making it available for analytics, we drive business outcomes that bring in new clients, a win for FSIs and prospects alike.
Through dashboarding capabilities enriched with customer lifetime value (CLV) models, we can easily report the financial benefits of identifying and serving creditworthy underbanked customers who currently do not have any credit instruments with the bank. The dashboard combines raw data, machine learning predictions, and explainability information. It identifies not only the probability of default for each underbanked customer but also the top three reasons unique to that customer, making it actionable both for credit agents evaluating creditworthiness and for the marketing team communicating with customers. Finally, as reported below, we also offer a way to assess the fairness of our credit scoring models and make sure we do not disadvantage any group of customers.
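The idea of "top three reasons" per customer can be sketched with a deliberately simple stand-in: a logistic model whose per-feature contribution is the weight times the (standardized) feature value. This is not the explainer used in the demo, which the text does not specify; the weights, feature names, and customer values below are hypothetical.

```python
import math

# Hypothetical logistic regression weights on standardized features.
WEIGHTS = {
    "missed_payments": 0.9,
    "credit_utilization": 0.6,
    "months_with_bank": -0.4,
    "monthly_income": -0.7,
    "mobile_topup_regularity": -0.3,
}
INTERCEPT = -1.0


def default_probability(features):
    """Probability of default from the logistic model."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def top_reasons(features, k=3):
    """Return the k features with the largest absolute contribution to the score."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]


# Hypothetical customer, features expressed in standard deviations from the mean.
customer = {
    "missed_payments": 2.0,
    "credit_utilization": 0.5,
    "months_with_bank": -1.0,
    "monthly_income": 1.5,
    "mobile_topup_regularity": 0.2,
}

probability = default_probability(customer)
reasons = top_reasons(customer)
print(round(probability, 3), [name for name, _ in reasons])
```

A positive contribution pushes the customer towards default and a negative one away from it, so the sign tells a credit agent whether a reason counts for or against the applicant.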
Part II - How to Serve More Clients with the Lakehouse Architecture
In this section we will go even deeper into the technical implementation and architecture of the credit decisioning demo and see how the Lakehouse helps financial organizations use their data to achieve their business goals.
The picture above depicts the actual architecture of the credit decisioning solution and shows how we achieve the aforementioned goals, including data unification, governance, and democratization.
Such end-to-end data lineage is critical for compliance, audit, observability, and discoverability of data.
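To show why lineage helps with audit and impact analysis, the sketch below models lineage as a small dependency graph and walks it in both directions. The table names and the graph are hypothetical and are not taken from the demo or from any Unity Catalog API.

```python
# Hypothetical lineage: each table maps to the tables it is derived from.
LINEAGE = {
    "credit_decisions": ["credit_features"],
    "credit_features": ["customers_clean", "transactions_clean", "telco_clean"],
    "customers_clean": ["customers_raw"],
    "transactions_clean": ["transactions_raw"],
    "telco_clean": ["telco_raw"],
}


def upstream(table, lineage):
    """All tables that `table` depends on, directly or indirectly."""
    seen = set()
    stack = [table]
    while stack:
        current = stack.pop()
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def downstream(table, lineage):
    """All derived tables that would be affected if `table` changed."""
    return {derived for derived in lineage if table in upstream(derived, lineage)}


# Audit question: which sources feed the final credit decisions?
sources = upstream("credit_decisions", LINEAGE)

# Impact question: what breaks if the partner telecom feed changes?
impacted = downstream("telco_raw", LINEAGE)
print(sorted(sources), sorted(impacted))
```

The same two traversals answer most lineage questions: walking upstream supports audit and root-cause analysis, while walking downstream supports impact analysis before a schema or source change.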
These are three very common scenarios in which full data lineage becomes especially important:
Some of the most customer-obsessed inventions of the last 20 years were underpinned by better automation. The iPhone introduced software to detect multi-touch instead of relying on manual hardware upgrades. PayPal revolutionized payments by leveraging the peer-to-peer network. And GPT-3 has changed the world by automating sophisticated text generation that has permeated our daily lives outside of work. Credit decisioning is now benefiting from the same level of innovation and automation. Instead of manually approving loans with incomplete data, any firm (bank or otherwise) can now extend credit to new individuals by automatically ingesting alternative data sources, governing PII to improve time to value, and automating credit decisioning using ML and AI. The credit decisioning framework on the Databricks Lakehouse is designed to codify exactly this kind of automation with software provided by Databricks.
To get started and build a credit data platform for your business, visit the demo at Databricks Demo Center.