
This is a guest blog from one of our partners: Tresata


Tresata and Databricks announced a real-time, Apache Spark and Hadoop-powered Anti-Money Laundering (AML) solution earlier today. Tresata’s predictive analytics application, TEAK, offers the market’s first at-scale, real-time AML investigation and resolution engine.

The performance, speed, predictive power and precision TEAK delivers would not have been possible without its Spark underpinnings.

Additionally, as one of the first business applications certified to run on Databricks Cloud, Tresata’s TEAK is breaking new ground and offering banks, retailers, telcos and regulators the only quick-start, rapidly scalable AML solution.

Tresata has always been at the leading edge of developing data analytics applications for complex, critical business processes. It was the first analytics application company to power its entire software with Hadoop, the first to use Cascading, the first to use Scalding, and the first to use Spark. In keeping with the ‘trend’, we are proud to be the first to use Databricks Cloud for deploying a core industry application.

When our application engines became some of the first to be “Certified on Spark”, we immediately saw the massive productivity boost our customers achieved by using a 100% Spark-powered analytics platform. This gave us an incentive to work closely with Databricks to enable one of our recent analytics applications to run on Databricks Cloud as well, especially given the enormity of the challenge.

Tresata has always believed that new big data technologies will unleash trillions of dollars of economic value. Money laundering is one such area: by itself, it is a trillion-dollar problem… annually.

The illegal process of taking ‘dirty’ money and making it ‘clean’ requires passing funds through an intricate network of people, places and things, and their inherent but otherwise unseen interconnections. Why, then, have current AML solutions focused solely on entity-level (person, business, corporation) or transaction-level risk scores, without viewing them in the context of the risk score of the greater network?

The short answer is a lack of technology that unites all the dimensions needed to predict money laundering accurately. And in real time.

Until now.

Recognizing this fatal flaw in current AML solutions (taking an entity-level look at what is a massive network problem), Tresata incubated TEAK in its R&D lab almost two years ago. Powered by Tresata ORION, the only real-time, Spark-certified network discovery, traversal and query engine, TEAK applies Tresata’s advanced algorithms and its ability to look at entire networks in real time to bring a new dimension to this problem: Tresata Network Scores.

These scores are at the heart of TEAK’s success at precisely predicting fraudulent transactions, based not just on entity dynamics but on the entire supply chain of money movement.

This required some breakthrough technological innovations, which would not have been possible without Spark, namely:

  • Interactive Real-Time Scoring Engine: in-memory RDDs and optimized data structures provide very fast refresh responses (typically a few seconds, and no more than 20 seconds for complex queries)
  • At-scale, immediate-response graph traversal: with an easy-to-use query language (QUE, our SQL-like graph query language) and a property-based graph model, one can do multi-hop traversals in network analysis
  • Graph Search Engine: the ability to scan the entire dataset, enabling more than just point queries
  • Go beyond a few thousand nodes: an at-scale graph engine that works for more than 50 million entities and hundreds of millions of edges, and doesn’t skimp on performance
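
To make the idea of a multi-hop traversal over a property-based graph more concrete, here is a minimal sketch in plain Python. It is not QUE, ORION, or any part of Tresata’s implementation, and it makes no attempt at scale. The function name `multi_hop_paths`, the account IDs, and the `amount` edge property are all hypothetical and chosen only for illustration.

```python
def multi_hop_paths(edges, start, max_hops, edge_filter=lambda props: True):
    """Return every simple directed path of 1..max_hops edges leaving `start`.

    Illustrative only: `edges` is a list of (source, target, properties)
    tuples, where `properties` is a dict attached to the edge. This layout
    is an assumption for the sketch, not Tresata's data model.
    """
    # Build an adjacency map, keeping only edges whose properties pass the filter.
    outgoing = {}
    for src, dst, props in edges:
        if edge_filter(props):
            outgoing.setdefault(src, []).append(dst)

    paths = []
    frontier = [[start]]  # paths that end at the current hop depth
    for _ in range(max_hops):
        next_frontier = []
        for path in frontier:
            for nxt in outgoing.get(path[-1], []):
                if nxt in path:
                    continue  # skip cycles so every path stays simple
                extended = path + [nxt]
                paths.append(extended)
                next_frontier.append(extended)
        frontier = next_frontier
    return paths


# Hypothetical money-movement edges between made-up accounts.
transfers = [
    ("acct1", "acct2", {"amount": 9000}),
    ("acct2", "acct3", {"amount": 8500}),
    ("acct3", "acct1", {"amount": 8000}),
    ("acct2", "acct4", {"amount": 50}),
]

# Follow only large transfers, up to three hops out from acct1.
large_paths = multi_hop_paths(
    transfers, "acct1", max_hops=3, edge_filter=lambda p: p["amount"] >= 1000
)
```

The point of the sketch is the shape of the query: start from one entity, follow edges that satisfy a property condition, and collect what is reachable within a fixed number of hops. A point query, by contrast, would look only at `acct1` itself.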

Tresata is excited to partner with Databricks to provide an answer to one of the biggest challenges that governments, societies and institutions face today. It is our belief that with the rise of open-source, distributed computing, this collaboration with Databricks spurs a true business revolution – one that applies the massive power of Spark, Hadoop, clouds and algorithms to solving massive business problems.

Tresata’s AML solution, powered by Spark and delivered on Databricks Cloud, delivers that power today.
