Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervision

Several compliance use cases (archiving, e-discovery, and supervision and surveillance, to name a few) appear naturally suited to Hadoop workloads but haven’t seen wide adoption. In this talk, we’ll discuss common limitations, show how Apache Spark helps, and propose new blueprints for modernizing this architecture and disrupting existing solutions. We’ll also discuss the rising role of Apache Spark in this ecosystem, where it brings machine learning and advanced analytics to a space that has traditionally been restricted to fairly rote reporting.

About Jordan Volz

Jordan Volz is a Systems Engineer at Cloudera. He helps clients design and implement big data solutions using Cloudera's Distribution of Hadoop across a variety of industry verticals. Previously, he worked as a consultant for HP Autonomy, delivering compliance archiving, e-discovery, and electronic surveillance solutions to regulated financial services companies, and as a developer at Epic Systems, building HIPAA-compliant EMR software.