Adobe’s Unified Profile System is the heart of its Experience Platform. It ingests terabytes of data a day and stores petabytes in total. As part of this massive growth, we have faced multiple challenges in our Apache Spark deployment, which is used from ingestion to processing. We want to share some of the hard-earned lessons we learned as we reached this scale.
Redis – Sometimes the best tool for the job is actually outside your JVM. Pipelining + Redis is a powerful combination to supercharge your data pipeline.
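To make the pipelining idea concrete, here is a minimal Python sketch, not the talk's actual implementation. It assumes a client that exposes the same `pipeline()`, `set()` and `execute()` calls as redis-py. The `write_partition` helper, the `profile:` key prefix and the in-memory `FakeRedis` stand-in are hypothetical names introduced only so the example runs without a Redis server.

```python
from itertools import islice


def write_partition(client, records, batch_size=1000):
    """Write (key, value) pairs using one round trip per batch.

    `client` is assumed to expose redis-py's pipeline() interface.
    Returns the number of commands sent.
    """
    records = iter(records)
    sent = 0
    while True:
        batch = list(islice(records, batch_size))
        if not batch:
            return sent
        # Commands are buffered client-side until execute() is called.
        pipe = client.pipeline(transaction=False)
        for key, value in batch:
            pipe.set(key, value)
        pipe.execute()  # flush the whole batch in a single round trip
        sent += len(batch)


class FakePipeline:
    """Hypothetical stand-in for a redis-py pipeline (illustration only)."""

    def __init__(self, owner):
        self.owner = owner
        self.commands = []

    def set(self, key, value):
        self.commands.append((key, value))
        return self

    def execute(self):
        self.owner.round_trips += 1
        self.owner.store.update(self.commands)
        results = [True] * len(self.commands)
        self.commands = []
        return results


class FakeRedis:
    """Hypothetical in-memory client that counts simulated round trips."""

    def __init__(self):
        self.store = {}
        self.round_trips = 0

    def pipeline(self, transaction=True):
        return FakePipeline(self)


fake = FakeRedis()
rows = ((f"profile:{i}", str(i)) for i in range(2500))
sent = write_partition(fake, rows, batch_size=1000)
print(sent, fake.round_trips)
```

With a real deployment, one plausible arrangement is to create a `redis.Redis(...)` client inside the function passed to Spark's `foreachPartition` and hand each partition's rows to a helper like this, so that every executor pays one network round trip per batch instead of one per record. The batch size here is an arbitrary illustrative value.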
We will present our war stories and lessons for the above, which we hope will benefit the broader community.
I am a Project Lead/Architect on the Unified Profile Team in the Adobe Experience Platform. Unified Profile is a PB-scale store with a strong focus on millisecond latencies and analytical capabilities, and easily one of Adobe's most challenging SaaS projects in terms of scale. I am actively designing and implementing the interactive segmentation capabilities, which help us segment over 2 million records per second using Apache Spark. I look for opportunities to build new features using interesting data structures and machine learning approaches. In a previous life, I was an ML Engineer on the Yelp Ads team, building models for Snippet Optimizations.