Learn how Pure Storage engineering manages a stream of 190B log events per day and makes use of that deluge of data in our continuous integration (CI) pipeline. Our test infrastructure runs over 70,000 tests per day, creating a triage problem that would otherwise require at least 20 triage engineers. Instead, Spark’s flexible computing platform allows us to write a single application for both streaming and batch jobs to understand the state of our CI pipeline with a team of just 3 triage engineers. Using encoded patterns, Spark indexes log data for real-time reporting (streaming), applies machine learning for performance modeling and prediction (batch), and finds previous matches for newly encoded patterns (batch).
Resource allocation in this mixed environment can be challenging; a containerized Spark cluster deployment and disaggregated compute and storage layers allow us to programmatically shift compute resources between the streaming and batch applications. This talk will cover the design decisions made to meet the SLAs of streaming and batch workloads, spanning hardware, data layout, access patterns, and container strategy. We will also go over the challenges, lessons learned, and best practices for this kind of setup.
Session hashtag: #SAISEnt11
Joshua Robinson is a Founding Engineer on the FlashBlade team and is currently a field data scientist, developing architectures for advanced analytics and AI systems. He spent 3.5 years on the core development team architecting and building the FlashBlade from the ground up. Prior to Pure, Joshua worked as a data scientist on the search infrastructure team at Google, building and running data pipelines and machine-learning algorithms for indexing the Internet. Joshua graduated with a PhD in Electrical and Computer Engineering from Rice University in 2009 with a focus on machine learning and algorithms.