
Let’s talk regulation. While not the sexiest topic for banks to deal with, regulatory compliance is critical to financial institutions’ success. On average, banks spend 10% of their revenue on compliance programs, making compliance the largest cost for most financial organizations. Additionally, with the rising tide of regulations since the 2008 global financial crisis, financial services institutions (FSIs) and their chief compliance officers are struggling to keep pace with regulations like the Fundamental Review of the Trading Book (FRTB), which takes effect in 2023, and the Comprehensive Capital Analysis and Review (CCAR). These regulations, along with many others, call for better data management and risk assessment.

FRTB is a new regulatory compliance mandate that goes live in January 2023. It will force banks around the world to raise their capital reserves and improve their data management practices so that they are better prepared to withstand market downturns. To comply with these new measures, banks will need to aggregate data from many disparate sources to build FRTB reports and calculate capital charges, which will prove especially challenging for large banks with multiple front-office systems. Banks will also need to evaluate their market risk and capital sensitivity, and integrate those calculations with the FRTB data aggregation.

IDG reports that additional computational and historical data storage capacity is required to process unprecedented volumes of disparate data and accommodate real-time data ingestion. In fact, Cloudera estimates that FRTB will require a 24x boost in historical data storage and a 30x increase in computational capacity.

Therefore, a key FRTB challenge for financial institutions is the need to overhaul market risk infrastructure technology to dramatically boost scalability and performance. Banks that get it right may avoid having millions of dollars tied up in capital reserve requirements. Data analytics at scale is a major pillar for banks facing the rising tide of regulation. This blog discusses the need for scale and how the lakehouse provides a modern architecture for data-driven compliance in financial institutions.

Computing for modern compliance

To meet modern compliance requirements, FSIs need to report on growing volumes of data stretching years into the past. Risk calculations that were once run weekly or daily must now be run several times per day and, in many cases, in real time as new data comes in. Additionally, regulations like FRTB require risk teams to scale simulations to thousands, if not millions, of scenarios in parallel. The volume of data, reporting frequency, and scale of calculations require massive compute power that far outstrips the capabilities of legacy on-premises analytics platforms. As a result, compliance risk teams are unable to analyze all their data or provide timely calculations to regulators.
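To make the idea of scaling scenarios in parallel concrete, here is a minimal sketch that splits a Monte Carlo run into independently seeded chunks and then computes expected shortfall, the tail-risk measure FRTB uses at a 97.5% confidence level. Everything else is an assumption for illustration: the function names, the single position with normally distributed daily returns, and the parameter values are hypothetical, not a real risk model.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_chunk(seed, n_scenarios, position_value=1_000_000.0, daily_vol=0.02):
    """Simulate one-day P&L for one chunk of scenarios.

    Assumption: a single position with normally distributed daily returns.
    """
    rng = random.Random(seed)  # independent, reproducible stream per chunk
    return [position_value * rng.gauss(0.0, daily_vol) for _ in range(n_scenarios)]


def expected_shortfall(pnl, confidence=0.975):
    """Average loss across the worst (1 - confidence) share of scenarios."""
    losses = sorted((-x for x in pnl), reverse=True)  # largest loss first
    tail_size = max(1, int(round(len(losses) * (1.0 - confidence))))
    tail = losses[:tail_size]
    return sum(tail) / len(tail)


def run_simulation(n_chunks=8, scenarios_per_chunk=5_000):
    """Fan the chunks out to workers and gather all simulated P&L values."""
    with ThreadPoolExecutor() as pool:
        chunks = pool.map(
            lambda seed: simulate_chunk(seed, scenarios_per_chunk),
            range(n_chunks),
        )
        return [x for chunk in chunks for x in chunk]


if __name__ == "__main__":
    pnl = run_simulation()
    print(f"scenarios: {len(pnl)}")
    print(f"97.5% expected shortfall: {expected_shortfall(pnl):,.0f}")
```

The thread pool here only illustrates the fan-out and gather pattern; in pure Python it will not speed up CPU-bound work. The point is that each chunk depends only on its seed, so the same split could be handed to a cluster of workers without changing the logic.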

Additionally, advanced data analytics is playing an increasingly important role in risk-related use cases like anti-money laundering (AML), know your customer (KYC), and fraud prevention. These use cases rely on anomaly detection across massive datasets to find a needle in a haystack. Machine learning (ML) enables risk teams to be more effective by reducing false positives and moving beyond rules-based detection. Unfortunately, traditional data warehouses lack the ML capabilities needed to deliver on these needs, nor can they scale to the billions of transactions that need to be analyzed to power these predictions. Bolt-on solutions for advanced analytics require data to be copied across platforms, leading to data inconsistencies and slower time to insight.

A modern data architecture

With the advent of FRTB and other regulations, data and compliance teams looking to take a data-driven approach to risk and compliance will find themselves considering a modern architecture. What matters is a platform built in the cloud that gives institutions the elastic scale they need to analyze massive volumes of data for risk and compliance purposes. Teams need a modern system that can process petabytes of batch and streaming data in near real time, which may not always be possible on a data lake or data warehouse alone. They also need to scale simulations to millions of scenarios across their portfolios to help mitigate risk. With this foundation, intraday and real-time reporting on controls for CCAR, FRTB, and other regulations becomes possible.

Fraud and AML detection is a big component of regulatory compliance, and it depends on anomaly detection. As mentioned earlier, anomaly detection identifies malicious activity hidden in masses of transaction data. To detect anomalies at scale, financial institutions (FIs) need to ingest and process voluminous datasets and then apply advanced analytics and AI-driven monitoring. This allows FSIs looking at anywhere from thousands to billions of transactions to detect anomalies and new, previously unknown patterns and threats.
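As a toy illustration of flagging outliers in transaction data, the sketch below scores each amount with a robust z-score based on the median and the median absolute deviation (MAD). This simple statistic is a stand-in for the ML models discussed in this post, not a description of any production system. The record layout, the function names, and the 3.5 threshold are assumptions.

```python
from statistics import median


def robust_z_scores(amounts):
    """Score each amount by its distance from the median, in MAD units.

    The 0.6745 factor rescales MAD so scores are comparable to standard
    z-scores when the data is roughly normal.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return [0.0 for _ in amounts]  # no spread, so nothing stands out
    return [0.6745 * (a - med) / mad for a in amounts]


def flag_anomalies(transactions, threshold=3.5):
    """Return the ids of transactions whose robust z-score exceeds the threshold."""
    scores = robust_z_scores([t["amount"] for t in transactions])
    return [t["id"] for t, s in zip(transactions, scores) if abs(s) > threshold]


if __name__ == "__main__":
    # Hypothetical transactions: seven ordinary amounts and one extreme one.
    amounts = [20, 25, 22, 19, 24, 21, 23, 5000]
    txns = [{"id": f"T{i}", "amount": a} for i, a in enumerate(amounts)]
    print(flag_anomalies(txns))
```

Using the median rather than the mean keeps a single extreme transaction from distorting the baseline it is measured against, which matters when the outliers are exactly what you are looking for.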

With advanced analytics, FIs can also correlate isolated signals that stem from the same threat. This reduces false positives and improves the quality of alerts, so teams can focus on relevant, high-risk fraud, AML, KYC, and compliance cases. Additionally, data and AI allow teams to automate repetitive compliance tasks and augment intelligence for investigations across massive and changing datasets, helping them better predict risky events and making the compliance team more agile.
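One simple way to picture correlating isolated signals is to group alerts that share an entity, such as an account or a device, into a single case, so that related alerts are investigated together. The sketch below does this with a union-find structure. The alert layout, field names, and identifiers are hypothetical, and real systems would use much richer linking logic.

```python
from collections import defaultdict


def correlate_alerts(alerts):
    """Group alerts that share any entity, directly or transitively, into cases."""
    parent = {a["id"]: a["id"] for a in alerts}

    def find(x):
        # Walk up to the root of x's group, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # entity -> id of the first alert that mentioned it
    for alert in alerts:
        for entity in alert["entities"]:
            if entity in first_seen:
                union(first_seen[entity], alert["id"])
            else:
                first_seen[entity] = alert["id"]

    cases = defaultdict(list)
    for alert in alerts:
        cases[find(alert["id"])].append(alert["id"])
    return sorted(sorted(ids) for ids in cases.values())


if __name__ == "__main__":
    alerts = [
        {"id": "A1", "entities": ["acct-1", "dev-9"]},
        {"id": "A2", "entities": ["dev-9", "acct-7"]},
        {"id": "A3", "entities": ["acct-3"]},
        {"id": "A4", "entities": ["acct-7"]},
    ]
    print(correlate_alerts(alerts))
```

In this example A1 and A4 share no entity directly, but both connect through A2, so all three land in one case while A3 stays on its own.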

Risk and compliance teams need an architecture that cuts through the complexity of ingesting and processing millions of data points to implement anomaly detection at scale, which lends itself well to fraud prevention. Such an architecture enables teams to move from rules to machine learning, respond faster, and reduce the operational costs associated with fraud.

Delta Lake and scale

The architecture described above resembles the Lakehouse paradigm. Many modern FIs are building it on Delta Lake, an open-source data management layer that simplifies all aspects of data management for ML. Delta Lake ingests and processes data with reliability and performance at scale, giving the Lakehouse the ability to scale rapidly to data sets that are, in principle, unlimited in size. The Lakehouse and Delta engine together provide a robust data foundation for ETL and advanced analytics, so teams can create compliance applications in an elastic computing environment. Beyond ETL, Delta Lake supports advanced analytics, enabling ML and AI on the platform. Scalable analytics and AI power compliance systems that detect and learn new patterns, which helps streamline compliance alert systems and address the issue of false positives. An AI system can automate repetitive tasks and can be engineered to detect anomalies and patterns that you’re not looking for, improving accuracy and helping to predict threats before they occur. For example, it can prevent two analysts from separately investigating two alerts that are part of the same threat (by contextualizing incidents and correlating isolated signals), which reduces the amount of work and improves detection.

Financial institutions are increasingly reporting that their current data systems for compliance cannot perform advanced analytics in a live setting that requires scale. The Lakehouse architecture can help simplify and build scalable risk and compliance solutions within a highly regulated environment. FINRA uses the Lakehouse platform to deter misconduct by enforcing rules and by detecting and preventing wrongdoing in the U.S. capital markets. With the Lakehouse, FINRA can quickly iterate on ML models and scale detection efforts to hundreds of billions of market events per day on a unified platform.

Learn more about how to modernize compliance on our Smarter risk and compliance with data and AI hub.