CUSTOMER STORY

Navy Federal uses real-time data to trailblaze member personalization

Delta Live Tables drives real-time omnichannel app monitoring for Navy Federal
Billions

of application events per month processed by DLT

9 billion

events streamed continuously for 9 months, 24/7, with near-zero maintenance

6-week

time to market for a new real-time omnichannel application

INDUSTRY: Financial Services
SOLUTION: Delta Live Tables, Databricks SQL
CLOUD: Azure

Navy Federal Credit Union was founded during the Great Depression by seven U.S. Navy employees who wanted to help themselves and their coworkers reach their financial goals. Today, Navy Federal Credit Union is the largest credit union in the world, serving 13 million member-owners. Navy Federal’s priority is to provide a personalized, omnichannel experience to their members. But to understand their members better, they needed to ingest and analyze online telemetry data in real time. To accomplish that, Navy Federal turned to Delta Live Tables (DLT), Databricks SQL and Microsoft Power BI.

Near real-time data introduces new insights

In June 2023, Navy Federal was about to start the process of migrating millions of users to the latest version of their online banking platform. The company planned to complete this migration in a controlled manner with small user groups, gradually expanding to all their members over several months. 

To make informed decisions, Navy Federal needed to closely track how members interacted with their product. To accomplish that, Navy Federal engineers created data pipelines to their data lake, and data analysts built dashboards and reports to visualize KPIs. The pipelines ran daily, so the information from the previous day would be available to leadership at the beginning of a new day. “The pipeline ran perfectly, but in today’s world, 24 hours is a long turnaround time,” Jian (Miracle) Zhou, senior engineering manager at Navy Federal Credit Union, said. “As we were moving closer to the first wave of user migration, we decided to enhance our data solution to introduce new insights in near real time.”

To accomplish that, the organization needed to create a near real-time dashboard with two KPIs: the number of unique members who were able to log into the new channel, and the number of unique sessions the channel served. They also needed to filter the KPIs both by time and by a member attribute indicating which migration wave each member belonged to. The expectation was that the dashboard would be updated continuously throughout the day with a latency of no more than 10 minutes.
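As a rough illustration, the two KPIs amount to distinct counts over login events, grouped by a time window and by migration wave. The sketch below expresses that in PySpark; the table and column names (login_events, memberId, sessionId, migrationWave, eventTimestamp) are assumptions for the example, not Navy Federal's actual schema.

```python
from pyspark.sql import functions as F

# Hypothetical login-event table and columns, used only to illustrate the two KPIs.
login_events = spark.read.table("login_events")

kpis = (
    login_events
    .groupBy(
        F.window("eventTimestamp", "10 minutes"),  # matches the target dashboard latency
        "migrationWave",                           # member attribute for wave filtering
    )
    .agg(
        F.countDistinct("memberId").alias("unique_members_logged_in"),
        F.countDistinct("sessionId").alias("unique_sessions"),
    )
)
```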

Zhou faced two primary challenges: time and scale. The organization only had about six weeks to build a solution from scratch and get it into production. “This was actually the first time my team tried to build a real-time solution in Azure Cloud, and we had to figure out everything in six weeks,” he said. “And scale-wise, we’re talking about billions of application events generated from millions of monthly active users, so the solution had to scale as the migration proceeded.”

Databricks DLT helps create a reliable, performant data pipeline

Navy Federal’s data delivery solutions follow a straightforward path. One side is for engineers and is all about data ingestion and curation. For every data source, they build a data pipeline to ingest data into their data lake, transform it and make it ready to be consumed by customers. This part of the equation is very dynamic because of the diversity of its data sources. The other side is for data analysts and is all about data serving and visualization. Navy Federal uses Azure Data Lake Storage as their storage layer, Databricks SQL as their analytical query engine and Power BI for data visualization. Together, they form a simple but effective foundation for data analytics. 

The source of the data Navy Federal needed to work with was the event telemetry generated in their online banking application.

Navy Federal Credit Union faced challenges using Azure Application Insights for dashboards because member attribute data wasn't accessible there. To address this, they streamed telemetry from Application Insights to their data lake using Azure Event Hubs, which supports the Kafka protocol for low-latency, high-reliability data consumption without custom integration code.

To process the streamed data, Navy Federal used Databricks Delta Live Tables. Zhou and his team built a simple data pipeline by writing three query functions and two change data capture (CDC) calls. The first query function connects to the event hubs, applies some transformation and returns a DataFrame. The second query function takes complex JSON documents and flattens them into individual application event records; the team then used DLT’s APPLY CHANGES function to remove duplicates. The third query function filters the data down to just the login events and selects only the columns relevant to login, and a second APPLY CHANGES call ensures there are no duplicates in the login event table. Finally, they told DLT which datasets should be persisted as tables and which could remain in-memory views, not accessible outside the pipeline, while it runs. The team then used the Expectations feature to set data-quality expectations and monitor the health of that data. “That’s all — three query functions and two CDC calls to complete all four steps. Simple as that,” Zhou said. “For me, that means low-code complexity and short development time.”
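The following is a minimal sketch of what a pipeline with that shape could look like in DLT’s Python API. All dataset names, the event schema and the Event Hubs connection details are assumptions made for illustration; the story does not publish Navy Federal’s actual code.

```python
import dlt
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

# Simplified placeholder schema for the Application Insights telemetry payload.
event_schema = StructType([
    StructField("eventId", StringType()),
    StructField("eventName", StringType()),
    StructField("memberId", StringType()),
    StructField("sessionId", StringType()),
    StructField("eventTimestamp", TimestampType()),
])

# Query function 1: connect to Event Hubs over the Kafka protocol and return a DataFrame.
@dlt.view
def raw_events():
    return (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "<namespace>.servicebus.windows.net:9093")
        .option("subscribe", "<event-hub-name>")
        .option("kafka.security.protocol", "SASL_SSL")
        .option("kafka.sasl.mechanism", "PLAIN")
        .option("kafka.sasl.jaas.config", "<Event Hubs connection string JAAS config>")
        .load()
    )

# Query function 2: flatten the JSON documents into individual application event records,
# with a data-quality expectation on the parsed payload.
@dlt.view
@dlt.expect_or_drop("valid_event", "eventId IS NOT NULL")
def flattened_events():
    return (
        dlt.read_stream("raw_events")
        .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
        .select("e.*")
    )

# CDC call 1: deduplicate into a persisted application event table.
dlt.create_streaming_table("application_events")
dlt.apply_changes(
    target="application_events",
    source="flattened_events",
    keys=["eventId"],
    sequence_by="eventTimestamp",
)

# Query function 3: keep only login events and the columns relevant to login analysis.
@dlt.view
def login_events_staging():
    return (
        dlt.read_stream("flattened_events")
        .where(F.col("eventName") == "login")
        .select("eventId", "memberId", "sessionId", "eventTimestamp")
    )

# CDC call 2: deduplicate into the login event table that serves the dashboard.
dlt.create_streaming_table("login_events")
dlt.apply_changes(
    target="login_events",
    source="login_events_staging",
    keys=["eventId"],
    sequence_by="eventTimestamp",
)
```

The intermediate datasets are declared as views, so they stay within the pipeline, while the two APPLY CHANGES targets are the persisted tables exposed to downstream consumers — mirroring the split between in-memory and persisted data described above.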

Critically, the pipeline Zhou and his team created scales aggressively, which was important as the omnichannel user migration progressed. “We started with a very small wave and kept increasing the migration size wave after wave. We were anticipating that would increase the demand for compute resource, but we were not sure exactly how much was needed for every wave. So, when we deployed this data pipeline to production, we turned on a feature called Enhanced Autoscaling,” Zhou explained. “The processing volume went up quickly, from just a few hundred users and a few thousand sessions every day to millions of users and billions of application events per month. This pipeline automatically scaled the resources depending on the data volume. Our production support engineers never had to intervene. They practically forgot about it.”
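Enhanced Autoscaling is enabled in the pipeline’s settings rather than in the transformation code. The fragment below is an illustrative sketch of those settings written as a Python dict; the worker counts are assumptions, not Navy Federal’s actual configuration, and the same options appear as JSON in a DLT pipeline’s configuration.

```python
# Illustrative DLT pipeline settings; values are assumptions for the sketch.
pipeline_settings = {
    "continuous": True,  # 24/7 streaming rather than scheduled triggered runs
    "clusters": [
        {
            "label": "default",
            "autoscale": {
                "min_workers": 1,
                "max_workers": 8,
                "mode": "ENHANCED",  # Enhanced Autoscaling
            },
        }
    ],
}
```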

The pipeline was also reliable and fault-tolerant. It ran 24/7 continuous streaming for nine months and successfully processed about 9 billion application events with almost-zero maintenance. “On some days there were hiccups that caused temporary interruptions,” Zhou said. “Those hiccups were the natural result of being in the cloud environment. The great thing is that the pipeline showed great resilience. Without exception it automatically recovered from those transient errors.”

The pipeline was also performant. Even as event data volume skyrocketed, the pipeline kept up. “The streaming processing speed and the query response time were to our satisfaction. Part of the reason was that DLT automatically optimizes, compacts and vacuums the tables it manages,” Zhou said. “This effectively eliminates or reduces the small file problem and removes files no longer needed from our storage account, which leads to better query performance.”

The last mile of the journey was the delivery of the dashboard. Databricks SQL allowed them to serve curated data to Power BI with speed and scale. “Our analysts connect Power BI to the data in our data lake through the Databricks SQL endpoint, build Power BI semantic models and create insight for visualizations,” Zhou explained. “Our dashboard directly queries Databricks. When the visuals are rendered, it brings back the latest updates to users in real time.”

DLT gets a new type of workload to production in record time

Using DLT, Navy Federal was able to complete a proof of concept in a week; develop, test and figure out a CI/CD process in three weeks; deploy the pipeline to production just before the start date of the first wave migration; and release the dashboard just a few days later. “The simplicity of the DLT programming model combined with its service capabilities resulted in an incredibly fast turnaround time,” Zhou said. “It truly allowed us to get a whole new type of workload to production in record time with good quality.”

The release of the real-time user activity monitoring dashboard, combined with historical views that revealed long-term trends, received overwhelmingly positive feedback from Navy Federal’s senior leadership and stakeholders.

“DLT hides the complexity of modern data engineering under its simple, intuitive, declarative programming model,” Zhou said. “As an engineering manager, I love the fact that my engineers can focus on what matters the most to the business. Delta Live Tables can take care of what is underneath the surface and ensure not only that our pipeline was built and deployed in time, but also held up beautifully in the long run.”

After the first version of the new pipeline went to production, Zhou and his team continued to innovate, using DLT to develop and deploy a homegrown metadata-driven framework. 

Zhou said the three things he likes best about DLT are the simple declarative programming model that accelerated speed to market; built-in scalability, optimization and reliability, which simplified operations; and accessibility through the API, which enabled space for engineering creativity and scalability.