Distributed Real-Time Stream Processing: Why and How - Databricks

The demand for stream processing is growing rapidly. Immense amounts of data must be processed quickly from an expanding set of disparate data sources, pushing traditional data-processing infrastructures to their limits. Stream-based applications include trading, social networks, the Internet of Things, system monitoring, and many other examples.
A number of powerful, easy-to-use open source platforms have emerged to address this. However, these platforms often solve the same problem in different ways, target varying but overlapping use cases, and use different vocabularies for similar concepts. This can lead to confusion, longer development times, or costly wrong decisions.

About Petr Zapletal

Petr is a consultant who specialises in the design and implementation of highly scalable, reactive, and resilient distributed systems. He is a functional programming and open source enthusiast with expertise in big data and machine classification techniques. Petr is an evangelist for the SMACK stack (Spark, Mesos, Akka, Cassandra, Kafka). He enjoys working with Akka and has deep knowledge of the toolkit's features, such as Akka Clustering, Distributed Data, and Akka Persistence. Petr is also a certified Spark developer.