Matteo Merli is currently a Senior Principal Engineer at Splunk and one of the original authors of Apache Pulsar. At Splunk, he is focused on integrating Pulsar in various products. Before Splunk, he was one of the co-founders of Streamlio that was acquired by Splunk. He currently serves as the VP of PMC chair for Apache Pulsar and also a member of the Apache BookKeeper PMC. Previously, he spent several years at Yahoo building database replication systems and multi-tenant messaging platforms.
June 25, 2020 05:00 PM PT
Apache Pulsar is the next generation messaging and queuing system with unique design trade-offs driven by the need for scalability and durability. Its two layered architecture of separating message storage from serving led to an implementation that unifies the flexibility and the high-level constructs of messaging, queuing and light weight computing with the scalable properties of log storage systems. This allows Apache Pulsar to be dynamically scaled up or down without any downtime. Using Apache BookKeeper as the underlying data storage, Pulsar guarantees data consistency and durability while maintaining strict SLAs for throughput and latency. Furthermore, Apache Pulsar integrates Pulsar Functions, a lambda style framework to write serverless functions to natively process data immediately upon arrival. This serverless stream processing approach is ideal for lightweight processing tasks like filtering, data routing and transformations. In this talk, we will give an overview about Apache Pulsar and delve into its unique architecture on messaging, storage and serverless data processing. We will also describe how Apache Pulsar is deployed in use case scenarios and explain how end-to-end streaming applications are written using Pulsar.