At Home Trust, we measure success in terms of relationships. Whether we’re working with individuals or businesses, we strive to help them stay “Ready for what’s next.”
Staying one step ahead of our customers’ financial needs means keeping their data readily available for analytics and reporting in an enterprise data warehouse, which we call the Home Analytics & Reporting Platform (HARP). Our data team now uses the Databricks Data Intelligence Platform and dbt Cloud to build efficient data pipelines, collaborate on business workloads and share data with critical partner systems outside the enterprise. In this blog, we share the details of our work with Databricks and dbt and outline the use cases that are helping us be the partner our customers deserve.
When it comes to data, HARP is our workhorse. We could hardly run our business without it. This platform encompasses analytics tools such as Power BI, Alteryx and SAS. For years, we used IBM DataStage to orchestrate the different solutions within HARP, but this legacy ETL solution eventually began to buckle under its own weight. Batch processing ran through the night, finishing as late as 7:00 AM and leaving us little time to debug the data before sending it off to partner organizations. We struggled to meet our service level agreements with our partners.
It wasn’t a difficult decision to move to the Databricks Data Intelligence Platform. We worked closely with the Databricks team to start building our solution and, just as importantly, to plan a migration that would minimize disruptions. The Databricks team recommended DLT-META, a framework that works with Databricks Delta Live Tables (DLT). DLT-META served as our data flow specification, enabling us to automate the bronze and silver data pipelines we already had in production.
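To give a sense of what DLT-META automates for us, here is a minimal sketch of a bronze-to-silver flow in Delta Live Tables SQL. The table names, file path and columns are hypothetical; in practice, DLT-META generates flows like this from onboarding metadata rather than hand-written code.

```sql
-- Illustrative DLT SQL; table names, path and columns are hypothetical
-- Bronze: incrementally ingest raw loan files
CREATE OR REFRESH STREAMING TABLE bronze_loans
AS SELECT *, current_timestamp() AS _ingested_at
FROM STREAM read_files('/Volumes/harp/raw/loans', format => 'csv');

-- Silver: cleanse and type the bronze data
CREATE OR REFRESH STREAMING TABLE silver_loans
AS SELECT
  CAST(loan_id AS BIGINT)        AS loan_id,
  CAST(funded_date AS DATE)      AS funded_date,
  CAST(amount AS DECIMAL(18, 2)) AS amount
FROM STREAM(LIVE.bronze_loans)
WHERE loan_id IS NOT NULL;
```

Multiplied across many source feeds, driving definitions like these from metadata instead of hand-maintained code is what made the migration tractable.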
We still faced the challenge of fast-tracking a migration with a team whose skill sets revolved around SQL. All our previous transformations in IBM solutions had relied on SQL coding. Looking for a modern solution that would allow us to leverage these skills, we decided on dbt Cloud.
Right from our initial trial of dbt Cloud, we knew we had made the right choice. It supports a wide range of development environments and provides a browser-based user interface, which minimizes the learning curve for our team. For example, we rebuilt a very familiar Slowly Changing Dimension (SCD) transformation and cut our development time considerably.
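In dbt, the natural home for that kind of Type 2 SCD logic is a snapshot. The sketch below is illustrative rather than our production code; the source and column names are hypothetical.

```sql
{% snapshot loans_snapshot %}

{{
    config(
      target_schema='snapshots',
      unique_key='loan_id',
      strategy='timestamp',
      updated_at='updated_at'
    )
}}

-- Illustrative snapshot; source and column names are hypothetical
select * from {{ source('harp', 'loans') }}

{% endsnapshot %}
```

Running `dbt snapshot` then maintains the change history automatically, adding `dbt_valid_from` and `dbt_valid_to` columns as rows change over time.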
Every batch processing run at Home Trust now relies on the Databricks Data Intelligence Platform and our lakehouse architecture. The lakehouse doesn’t just ensure we can access data for reporting and analytics, as important as those activities are; it also processes the data that feeds our critical downstream systems.
In short, if our batch processing were to get delayed, our bottom line would take a hit. With Databricks and dbt, our nightly batch now ends around 4:00 AM, leaving us ample time for debugging before we feed our data into at least 12 external systems. We finally have all the computing power we need. We no longer scramble to hit our deadlines. And so far, the costs have been fair and predictable.
From end to end, the pipeline now runs on Databricks and dbt Cloud: Delta Live Tables lands data in our bronze and silver layers, dbt Cloud builds the gold models on top, and the results flow out to our reporting tools and partner systems.
None of this would be possible without intense collaboration between our analytics and engineering teams – which is to say none of it would be possible without dbt Cloud. This platform brings both teams together in an environment where they can do their best work. We’re continuing to add dbt users so that more of our analysts can build proper data models without help from our engineers. Meanwhile, our Power BI users will be able to leverage these data models to create better reports. The results will be greater efficiency and more trustworthy data for everyone.
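For context, a gold-layer model in dbt is just a version-controlled SQL select. The sketch below, with hypothetical model and column names, is the kind of model our analysts can now own end to end:

```sql
-- models/gold/fct_loan_balances.sql (hypothetical model and column names)
select
    l.loan_id,
    l.customer_id,
    l.product_type,
    l.amount - coalesce(p.total_paid, 0) as outstanding_balance
from {{ ref('silver_loans') }} l
left join {{ ref('int_payments_rollup') }} p
    on l.loan_id = p.loan_id
```

Because a model like this is tested and documented in dbt Cloud, every Power BI report built on top of it inherits the same definitions.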
Within the Databricks Data Intelligence Platform, users work wherever they’re most comfortable: depending on their background, some write code in Notebooks while others use the SQL Editor.
By far the most useful tool for us is Databricks SQL, the platform’s intelligent data warehouse. Before we can power our analytics dashboards, we have to run complex SQL queries that aggregate our data. And because all of that data sits in one place, many different analytics tools, such as Power BI, can access it through Databricks SQL.
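A typical example is the kind of daily rollup a Power BI dashboard reads directly from a Databricks SQL warehouse. The query below is illustrative only, with hypothetical table and column names:

```sql
-- Illustrative dashboard query; table and column names are hypothetical
SELECT
  funded_date,
  COUNT(*)    AS mortgages_funded,
  SUM(amount) AS total_funded_amount
FROM gold.loans
WHERE funded_date >= date_sub(current_date(), 30)
GROUP BY funded_date
ORDER BY funded_date;
```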
Our teams continue to be amazed by the performance of Databricks SQL. Some of our analysts used to aggregate data in Azure Synapse Analytics. When they began running on Databricks SQL, they had to double-check the results because they couldn’t believe an entire job had finished so quickly. This speed enables them to add more detail to reports and crunch more data. Instead of sitting back and waiting for jobs to finish, they’re answering more questions from our data.
Unity Catalog is another game changer for us. So far, we’ve implemented it only for our gold layer of data, but we plan to extend it to our silver and bronze layers and, eventually, across our entire organization.
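Governance in Unity Catalog comes down to standard SQL grants. A minimal sketch, with hypothetical catalog, schema and group names:

```sql
-- Hypothetical catalog, schema and group names
-- Analysts get read access to the gold layer only
GRANT USE CATALOG ON CATALOG harp TO `analysts`;
GRANT USE SCHEMA, SELECT ON SCHEMA harp.gold TO `analysts`;

-- Engineers manage the bronze and silver layers
GRANT ALL PRIVILEGES ON SCHEMA harp.bronze TO `data_engineers`;
GRANT ALL PRIVILEGES ON SCHEMA harp.silver TO `data_engineers`;
```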
Like every financial services provider, we’re always looking for ways to derive more insights from our data. That’s why we started using Databricks AI/BI Genie to engage with our data through natural language.
We plugged Genie into our loan data, our most important data set, after using Unity Catalog to mask personally identifiable information (PII) and provision role-based access to the Genie room. Genie uses generative AI that understands the unique semantics of our business, and it continues to learn from our feedback. Team members can ask Genie questions and get answers informed by our proprietary data. Because Genie knows about every loan we make, it can tell you how many mortgages we funded yesterday or the total outstanding receivables from our credit card business.
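The PII masking itself is plain Unity Catalog SQL. Here is a minimal sketch, assuming a hypothetical loans table with a Social Insurance Number column and a privileged group allowed to see raw values:

```sql
-- Hypothetical table, column and group names
-- Only members of the privileged group see raw Social Insurance Numbers
CREATE OR REPLACE FUNCTION gold.mask_sin(sin STRING)
RETURNS STRING
RETURN CASE
  WHEN is_account_group_member('pii_readers') THEN sin
  ELSE '***-***-***'
END;

ALTER TABLE gold.loans
  ALTER COLUMN sin SET MASK gold.mask_sin;
```

Once the mask is attached, every downstream consumer, including the Genie room, sees only redacted values unless they belong to the privileged group.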
Our goal is to lean on NLP-based systems like Genie precisely because they spare us the operational overhead of building and maintaining such systems from scratch. We hope to expose Genie as a chatbot that everyone across our business can use to get speedy answers.
Meanwhile, the Databricks Data Intelligence Platform offers even more AI capabilities. Databricks Assistant lets us query data from within Databricks Notebooks and the SQL Editor. We can describe a task in plain language and let the system generate SQL queries, explain segments of code and even fix errors. All of this saves us many hours of coding.
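As a flavor of that workflow, a plain-language prompt like the comment below might yield SQL along these lines. This is illustrative only: the table and column names are hypothetical, and the Assistant’s actual output will vary.

```sql
-- Prompt (illustrative): "How many mortgages did we fund each month
-- this year, and for how much?"
SELECT
  date_trunc('month', funded_date) AS month,
  COUNT(*)                         AS mortgages_funded,
  SUM(amount)                      AS total_funded_amount
FROM gold.loans
WHERE funded_date >= date_trunc('year', current_date())
GROUP BY ALL
ORDER BY month;
```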
Although we’re still in our first year with Databricks and dbt Cloud, we’re already impressed by the time and cost savings these platforms have generated.
With more than 500 dbt models in our gold layer of data and about half a dozen data science models in Databricks, Home Trust is poised to continue innovating. Each of the technology enhancements we’ve described supports an unchanging goal: to help our customers stay “Ready for what’s next.”
To learn more, check out this MIT Technology Review report. It features insights from in-depth interviews with leaders at Apixio, Tibber, Fabuwood, Starship Technologies, StockX, Databricks and dbt Labs.