At OLX we produce about 50 million messages a day for delivery to our 300+ million users across the globe via email, SMS or push. The majority of these notifications rely on processing the billions of events generated by our web and mobile platforms to understand users' behaviour and to craft relevant messages designed to influence the customer journey positively.
The initial approach was to process events in bulk, in a daily batch data-warehouse-y fashion. This worked well for a while, but requirements changed: to be more relevant and effective, we had to react faster to user actions, and once-a-day processes didn’t work anymore. On top of that, data regulations, reliability and infrastructure costs urged us to step up our game and improve the way we engineer data.
In this presentation I will discuss the approach, challenges and learnings of migrating our notification platform from a monolithic batch system based on AWS Redshift, SQL and ETL pipelines to a microservice-based, real-time system developed with Apache Spark and Python.
Session hashtag: #SAISExp5
Italian, early 30s. I started my career in Italy working as a Linux SysAdmin and Oracle DBA for a large manufacturing company and one of the biggest Italian banking groups. I then moved to the UK, where I joined Xerox as a BI and Data Engineer focused on Customer Care Innovation and Automation. In late 2017, I joined the OLX Group Berlin Tech Hub as a Data Engineer focusing on Customer Journey and Experience Innovation through relevant customer communications. Lifelong learner. Passionate about Data, Music, Food and Travel. Techs: Spark, Python, R, SQL, AWS, Linux, Redshift, Oracle, Tableau.