Josef Adersberger - Databricks

Josef Adersberger

CTO, QAware GmbH

Josef Adersberger has been a software engineering fanatic for over 10 years. He studied computer science in Rosenheim and Munich and holds a doctoral degree in software engineering. He’s the founder and CTO of QAware, a German software development company, and is a lecturer at several German universities. His main area of interest is cloud computing.



Clickstream Analysis with Spark: Understanding Visitors in Realtime

Summit East 2016

Users leave thousands of traces per second on a successful ecommerce site. It makes practical sense to analyse this stream of trace events and react to it in realtime; this is called clickstream analysis. In this talk I'll present a software architecture based on Apache Spark that can process thousands of clickstream events per second. A product based on this architecture has been in production since mid-2015. Besides Spark itself, the building blocks of the architecture are Kafka to handle the inbound event stream, Spark Streaming for initial stream processing, and Parquet as the serialization format. I'll explain why we chose these technologies and share our experiences developing, launching, and operating the product.

Learn more:
  • Insights into Customer Behavior from Clickstream Data
  • Diving into Apache Spark Streaming’s Execution Model