Customer Case Study

Juntos Global is a financial technology firm that partners with global financial institutions to improve communication between banks and their customers, increasing customer satisfaction and loyalty.

Vertical Use Case

Data Preparation and ETL for their text messaging service between customers and financial institutions

Technical Use Case

  • Ingest/ETL

The Challenges

Juntos Global delivers a text messaging service to account holders on behalf of financial institutions. This requires exchanging gigabytes of account and customer data (log and flat files), which arrives in varying formats and on varying schedules, and then preparing that data for downstream analytics.

  • Complex Infrastructure: The legacy ETL data pipeline was built in Pentaho and pointed to a PostgreSQL database. These systems created too much DevOps work for a lean engineering team.
  • Tuning the ETL Pipeline Was Complex: Data analysts used SQL to tune the ETL pipeline. They also handled a high volume of batch file transfers containing missing or incorrect data. As a result, rewriting logic in Postgres was a large, ongoing effort.
  • Limited Scalability: They used R to explore data and perform analytics. However, because R is single-threaded by default and runs on a single node, they could not analyze entire data sets and were forced to sample their data.
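
To make the second challenge concrete, here is a minimal plain-Python sketch of the kind of cleanup step such a pipeline has to perform on incoming flat files: mapping differently named columns to one schema, parsing dates in more than one format, and setting aside rows with missing or incorrect values. It is an illustration only, not Juntos Global's pipeline. The column aliases, date formats, and rejection rule are all hypothetical assumptions.

```python
import csv
import io
from datetime import datetime

# Hypothetical aliases: each incoming file may name the same field differently.
COLUMN_ALIASES = {
    "account_id": {"account_id", "acct_id", "account"},
    "phone": {"phone", "mobile"},
    "opened_on": {"opened_on", "open_date"},
}

# Hypothetical date formats that might appear across files.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def parse_date(text):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize(raw_csv):
    """Map columns to canonical names and split rows into (clean, rejected)."""
    clean, rejected = [], []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        record = {}
        for canonical, aliases in COLUMN_ALIASES.items():
            # Take the first non-empty value whose header matches an alias.
            record[canonical] = next(
                (v.strip() for k, v in row.items()
                 if k and v and k.strip().lower() in aliases),
                "",
            )
        record["opened_on"] = parse_date(record["opened_on"])
        # Assumed rule: a row is usable only with an account ID and a valid date.
        if record["account_id"] and record["opened_on"]:
            clean.append(record)
        else:
            rejected.append(record)
    return clean, rejected
```

Calling normalize() on the text of one file returns the usable rows and the rejected rows as two separate lists, so rejected rows can be reviewed rather than silently dropped.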

The Solution

Juntos Global selected the Databricks Unified Analytics Platform to simplify data engineering and accelerate downstream analytics.

  • Fully Managed Platform: Automating infrastructure management and removing the complexities of managing the analytics workflow allowed their lean team to focus on their core expertise rather than DevOps.
  • Unified Platform: Transitioned ETL pipelines to Databricks, resulting in faster, more reliable pipelines at scale.
  • Support for PySpark: Able to tune pipeline performance using a Python-based model.

The Results

Juntos Global gained significant operational efficiencies that reduced time-to-value within their analytics pipeline.

  • Faster Data Processing – ETL jobs used to take hours on a sample of data. With Databricks, they can run ETL on their entire data set in minutes.
  • Flexible Architecture – Reduced the amount of DevOps work needed to normalize the data while retaining its unique characteristics for downstream analytics.
  • Improved Productivity – The increase in operational efficiency allows their team to spend more time on exploration and analytics.

Databricks has removed the technical barriers of normalizing our data for the purpose of downstream analytics. As a result, we’ve reduced our ETL process from hours to minutes.

Dante Cassanego
CTO and Co-founder at Juntos Global