CUSTOMER STORY
Secure and personalized payment options for customers at scale

INDUSTRY: Financial services

SOLUTION: Fraud detection, recommendation engines, risk management, transaction enrichment

PLATFORM USE CASE: Lakehouse, Delta Lake, data science, machine learning, Databricks SQL, ETL

CLOUD: AWS


Bread, a division of Alliance Data Systems, is a technology-driven payments company that integrates with merchants and partners to personalize payment options for their customers. As their data streams grew from GBs to TBs, their existing Snowflake implementation struggled to process data efficiently in near real time. With the Databricks Lakehouse Platform, the company can analyze data rapidly, without incurring high compute costs, and build reliable, high-performance pipelines that support both batch and streaming at scale. With this data at their fingertips, Bread is able to streamline operations and mitigate risk across multiple use cases, from automating application processing and credit risk analysis to recommending products that drive customer loyalty and lifetime value.

The complexity of siloed data sources for multiple use cases

Bread is a technology-driven payments company that integrates with banking partners and over 400 merchants to create personalized and flexible financing and payment options for more than 1 million customers. The Bread platform, which runs on AWS and consists of several dozen microservices (including application, checkout and payments), allows merchants to offer more ways to pay over time, resulting in improved conversion rates and higher average order value.

With plenty of big data use cases, such as financial reporting, fraud detection, credit risk, loss estimation, and a full-funnel recommendation engine, the data team behind Bread needed a way to bring together siloed data sources for their analysts and data scientists to gain a complete view into how best to serve their customers. But their existing Snowflake ingestion pipeline struggled to keep up with the scale of platform deployments and the explosion of data.

Prior to Databricks, Bread relied on daily CSV data dumps into Snowflake, a process ill-suited to 40-70 GB workloads. These performance bottlenecks significantly impacted their ability to perform release testing and to serve their partners and analytics team with near real-time data.

“We couldn’t afford the time it took to transfer data to Snowflake,” explained Christina Taylor, Staff Data Engineer at Bread. “A job that typically took an hour and a half meant that no one else could do business analytics during that time. Due to the data volume, the naive ingestion pipeline—from data dump to Snowflake—struggled to scale.”

Further, Bread’s varied use cases revealed a collaboration challenge. Analytics and reporting teams work primarily with dbt, while engineers develop in Python and Scala and focus on ingestion. This created a significant dependency on data engineering, and engineering often lacked a deep understanding of the business use cases or the transformations that were needed.

Moving to the Lakehouse to accelerate data democratization

Bread migrated to the Databricks Lakehouse Platform on AWS to efficiently ingest transactional data from their point-of-sale systems and ELT it into Delta Lake, while enabling their analytics engineers and data scientists to democratize datasets for credit risk, loss estimation and fraud use cases.

Now, the team can ingest an initial snapshot of data stored in S3 and perform ongoing replication and upserts into Delta Lake. Spark jobs process all the databases and tables needed for analytics in parallel. Ingestion that once took 90 minutes now completes in just 10.
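In practice, this snapshot-plus-upsert pattern maps naturally onto Delta Lake's MERGE API. The PySpark sketch below shows one way it can look; the S3 paths, table layout and transaction_id key are illustrative assumptions, not Bread's actual schema.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Illustrative locations -- placeholders, not Bread's real buckets
SNAPSHOT_PATH = "s3://example-bucket/pos/snapshot/"
CHANGES_PATH = "s3://example-bucket/pos/changes/"
TARGET_PATH = "s3://example-bucket/delta/transactions/"

# One-time load: write the initial snapshot into Delta Lake
snapshot = spark.read.format("parquet").load(SNAPSHOT_PATH)
snapshot.write.format("delta").mode("overwrite").save(TARGET_PATH)

# Ongoing replication: upsert each change batch by primary key
changes = spark.read.format("parquet").load(CHANGES_PATH)
target = DeltaTable.forPath(spark, TARGET_PATH)
(
    target.alias("t")
    .merge(changes.alias("s"), "t.transaction_id = s.transaction_id")
    .whenMatchedUpdateAll()     # apply updates to existing rows
    .whenNotMatchedInsertAll()  # insert rows seen for the first time
    .execute()
)
```

Running one such job per source table is one way to achieve the parallel processing described above.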

On the collaboration front, Bread is getting started with Delta Live Tables and Databricks interactive notebooks to make working together easier. “The ability to mix languages in a single pipeline means data engineering, analytics and data science teams can effectively work together,” added Taylor. “Within hours, they will be able to start from a raw data set, create a work-in-progress model to play with, and practice machine learning at scale.” With Delta Live Tables, Bread’s data engineering team can quickly deliver high-quality data on Delta Lake to those who need it most, saving time and reducing operational complexity.
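As a sketch of what such a pipeline can look like in Python, the example below defines a raw table fed by Auto Loader and a validated table built on top of it with a Delta Live Tables expectation. The paths, table names and validation rule are hypothetical, and spark is the session object the DLT runtime provides.

```python
import dlt
from pyspark.sql import functions as F

# Hypothetical bronze table: raw point-of-sale files picked up by Auto Loader
@dlt.table(comment="Raw point-of-sale transactions")
def transactions_raw():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("s3://example-bucket/pos/raw/")  # placeholder path
    )

# Silver table analysts can query; rows failing the expectation are dropped
@dlt.table(comment="Validated transactions for analytics")
@dlt.expect_or_drop("valid_amount", "amount > 0")
def transactions_clean():
    return (
        dlt.read_stream("transactions_raw")
        .withColumn("ingested_at", F.current_timestamp())
    )
```

Because Delta Live Tables manages the dependency graph between tables, analytics engineers can add downstream tables in SQL or Python without touching the ingestion code.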

With data flowing seamlessly, engineers also leverage the Databricks SQL Connector to discover and decrypt customers’ personally identifiable information (PII) for use cases such as payment processing and transaction reporting.
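A minimal sketch of querying decrypted fields through the Databricks SQL Connector for Python might look like the following; the hostname, HTTP path, token, table name and decrypt_pii UDF are all placeholders standing in for Bread's actual setup.

```python
from databricks import sql

# Connection details are placeholders; in practice they would come from
# a secrets manager rather than being hardcoded.
with sql.connect(
    server_hostname="example.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abc123",
    access_token="dapi-example-token",
) as connection:
    with connection.cursor() as cursor:
        # decrypt_pii is a hypothetical UDF standing in for whatever
        # governed decryption routine protects customer PII.
        cursor.execute(
            "SELECT order_id, decrypt_pii(customer_email) AS customer_email "
            "FROM payments.transactions "
            "WHERE order_date = current_date()"
        )
        for row in cursor.fetchall():
            print(row.order_id, row.customer_email)
```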

Better insights (plus a massive reduction in compute costs)

With Databricks, Bread is now able to scale their data platform without worrying about data volumes slowing ingestion and performance. As a result, they are able to analyze TBs of data for downstream business reporting, analytics and ML use cases designed to improve business decision-making and the customer experience. This rapid ingestion of both batch and streaming data has helped increase the availability of actionable insights for business reporting by 20%.

In addition to better insights and reducing workflows from 90 minutes to 10 minutes (a 90% improvement), Bread has seen a significant reduction in compute costs thanks to Databricks Auto Loader. Rather than running streaming clusters around the clock, they now use streaming triggers and provision compute resources on demand with autoscaling clusters, helping them handle 140x the data at only 1.5x the cost.
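The cost-saving pattern here is Auto Loader combined with a one-shot trigger: the stream starts on demand, drains whatever new files have landed, then shuts down, so no cluster has to stay up between runs. A minimal sketch, assuming illustrative S3 paths (in a Databricks notebook, spark is the built-in session):

```python
# Incrementally discover new CSV files as they land in S3
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/pos/")
    .load("s3://example-bucket/pos/landing/")  # placeholder landing zone
)

# availableNow processes the current backlog and then stops, which pairs
# well with on-demand autoscaling job clusters instead of an always-on stream
(
    stream.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/pos/")
    .trigger(availableNow=True)
    .start("s3://example-bucket/delta/pos/")
)
```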

“Our vision for the future is for the Lakehouse to be our single source of truth for all company facts,” said Taylor. “In addition to streaming Auto Loader ingestion, we also plan to use events to build a complete core data layer in Delta Lake where we can validate schema, run transformations for analytics, and use Delta Live Tables to simplify our ETL pipelines for near real-time decision making. Having a platform as robust as Databricks has really laid the foundation for our future success.”

  • 140x
    more data at 1.5x the cost
  • 90%
    reduction in data processing time
  • 20%
    increase in actionable insights for reporting

“Moving from Snowflake to Databricks’ lakehouse has transformed the way we drive business actions based on the most complete and current view of our data, something that wasn’t possible with our data warehouse before.”

– Christina Taylor, Staff Data Engineer, Bread