Jack Gudenkauf has thirty years of experience designing and implementing Internet-scale distributed systems. He is currently a Senior Architect with HPE working on Parallel Streaming Transformation Loader (PSTL) systems for Big Data customers. He was previously CEO and founder of BigDataInfra.com and VP of Big Data at Playtika, the social casino global gaming category leader. Jack was a hands-on manager of the Analytics Data Warehouse team at Twitter, responsible for the infrastructure and tools that load data to and from HDFS, HP-Vertica, MySQL, and other data sources. Prior to Twitter, Jack spent 15 years at Microsoft, where he shipped 15 products.
Organizations building big data analytics solutions for streaming environments struggle with adapting legacy batch systems for streaming, supporting multiple columnar analytical databases, providing time series aggregations, and streaming fact and dimension data into star schemas. In this session, you will learn how we overcame these challenges and developed a self-service “ETL” framework that requires no code from end users. Extensible and operationally robust, this developer framework includes a Spark Structured Streaming app for Kafka, Hadoop/Hive (ORC, Parquet), OpenTSDB/HBase, and Vertica data pipelines.