Cyber threat detection and response requires demanding workloads over large volumes of log and telemetry data. A few years ago I came to Apple after building such a system at another FAANG company, and my boss asked me to do it again. I learned a lot from my prior experience using Apache Spark and AWS S3 at massive scale: some good patterns, but also some bad patterns and pieces of technology that I wanted to avoid. That year I ran into Michael Armbrust at Spark+AI Summit and described what I wanted to do, along with a plan to test Databricks as a foundation for the new system. A few months later, while we were in the middle of our proof-of-concept build-out on Databricks, Michael gave me some code they were calling Tahoe. It was the early alpha of what became Delta Lake, and it was exactly what we wanted. We have been running our entire system, writing out hundreds of TB of data a day, on Delta Lake since the very beginning.
This presentation will cover some of the issues we encountered and things we have learned about operating very large workloads on Databricks and Delta Lake.
Dominique Brezinski is a member of Apple's Information Security leadership team and a principal engineer working with the Threat Response org. He has twenty-five years of experience in security engineering, with a focus on intrusion detection and incident response systems design and development. Dom has been working with Apache Spark in production since the 0.8 release.