Senior Data Engineer at Tiger Analytics with experience building advanced analytics solutions that support business-driven decisions. Extensive experience designing and implementing modern data pipelines on cloud platforms. At Tiger Analytics, my focus is on the evolution and delivery of technology that enables large-scale data processing in support of machine learning and AI analysis on an Enterprise Data Analytics Platform.
November 17, 2020 04:00 PM PT
As the days go by, everything changes: your business, your analytics platform, and your data. Deriving real-time insights from this huge volume of data is therefore key to survival. This robust solution lets you operate at the speed of change.
The most compelling operational analysis demands real-time rather than historical data, so ML/AI algorithms need to accept real-time workloads in order to make ever-more-accurate operational predictions.
Data from various sources is pushed to Kafka using the open source CDC tool Debezium. Spark Structured Streaming reads the near-real-time data and sinks it into Delta Lake OSS. The entire data pipeline is orchestrated with Kubernetes so that it runs as a scalable pipeline.
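To make the CDC step concrete, here is a minimal sketch of how Debezium-style change events can be folded into the current state of a table. It is plain Python with an in-memory dict standing in for the sink, not the speakers' Spark job. The event envelope fields (`payload`, `op`, `before`, `after`) follow Debezium's documented JSON format; the function name `apply_change_event`, the `id` key column, and the sample rows are assumptions made for illustration.

```python
import json
from typing import Any, Dict, Optional

Row = Dict[str, Any]


def apply_change_event(table: Dict[Any, Row], raw_event: str, key: str = "id") -> None:
    """Apply one Debezium-style change event to an in-memory table.

    `table` maps primary-key values to the latest row image. The `key`
    column name is an assumption for this sketch.
    """
    event = json.loads(raw_event)
    # When the JSON converter embeds schemas, the change data sits under "payload".
    payload = event.get("payload", event)
    op = payload["op"]
    before: Optional[Row] = payload.get("before")
    after: Optional[Row] = payload.get("after")

    if op in ("c", "r", "u"):
        # create, snapshot read, update: keep the newest row image
        table[after[key]] = after
    elif op == "d":
        # delete: only the "before" image is populated
        table.pop(before[key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")


# Hypothetical events, in the order they would arrive on one Kafka partition.
events = [
    {"payload": {"op": "c", "before": None, "after": {"id": 1, "name": "Ada"}}},
    {"payload": {"op": "c", "before": None, "after": {"id": 2, "name": "Lin"}}},
    {"payload": {"op": "u", "before": {"id": 2, "name": "Lin"},
                 "after": {"id": 2, "name": "Lin W."}}},
    {"payload": {"op": "d", "before": {"id": 1, "name": "Ada"}, "after": None}},
]

customers: Dict[Any, Row] = {}
for e in events:
    apply_change_event(customers, json.dumps(e))
```

In a streaming job the same per-key logic would typically be expressed as an upsert into the sink table rather than a dict update; the abstract does not say how the speakers handle this step.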
Speakers: Sandeep Reddy and Karthikeyan Siva Baskaran