Quality Assurance for Big Data Processes Using Databricks and FINRA’s MegaSparkDiff

“At FINRA, we process an average of 37 billion market events per trading day. These events are analyzed by a series of systems. Validating those systems is a unique challenge because of the sheer volume of data involved and the need to present differences in a user-friendly way. We’ll start with a quick primer on financial markets, followed by a brief overview of our cloud architecture and how it addresses this challenge.

Additionally, we will explore our open source project, MegaSparkDiff, and explain how we approach quality assurance for big data processing. Through code examples in notebooks, we will demonstrate a sample big data process with assurance coverage using Databricks.

The talk will demonstrate, with code samples, how a QA engineer validates these systems in a notebook environment.

Takeaways:
1. A fully functioning notebook with visual diff analysis code samples.
2. MegaSparkDiff codebase on GitHub”

About Ahmed Ibrahim

Ahmed Ibrahim is a Principal Architect of Technology in assurance engineering. In this role, he is responsible for the architecture of assurance processes for market surveillance projects. His work involves big data analytics and machine learning solutions. Previously, Ahmed designed enterprise systems for multiple international trade facilitation systems and national security systems. His interests include graph analysis and visual-intelligence-led investigation techniques. Ahmed received a B.Sc. in Electrical Computer Engineering in 2001.