A High Performance Mutable Engagement Activity Delta Lake

May 26, 2021 05:00 PM (PT)

At Salesforce, our customers use High Velocity Sales to intelligently convert leads and create new opportunities. To support it, we built the engagement activity platform, which automatically captures and stores user engagement activities in Delta Lake. The platform is one of the key components supporting Einstein Analytics for creating powerful reports and dashboards, and Sales Cloud Einstein for training machine learning models.

Converting leads and creating new opportunities requires our engagement activity Delta Lake to handle data mutations at scale. In this presentation, we will share the challenges and lessons learned from building a high-performance mutable data lake with Delta Lake, including:

  • Independent Stream Process to Support Engagement Data Lifecycle
  • Downstream Incremental Read
  • High Throughput Transactions in Engagement ID Mutation
    • Detect Cascading ID Mutation with Graph
    • Data Skipping and Z-Order with I/O Pruning
  • High Data Consistency and Integrity
    • Exactly-Once Write Across Tables
    • Global Synchronization and Ordering
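
As a rough illustration of the "Detect Cascading ID Mutation with Graph" topic, the sketch below treats each ID mutation as a directed edge from the old ID to the new one and follows each chain to its final ID. This is only an assumption about what cascading detection could look like, not the presenters' implementation; the function name `resolve_cascading_ids` and the sample IDs are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def resolve_cascading_ids(mutations: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map every superseded ID to the final ID at the end of its mutation chain."""
    # Directed graph: each old ID points at the ID that replaced it.
    next_id: Dict[str, str] = {}
    for old_id, new_id in mutations:
        next_id[old_id] = new_id

    resolved: Dict[str, str] = {}
    for start in next_id:
        seen = {start}
        current = start
        # Walk the chain until we reach an ID that was never replaced.
        while current in next_id:
            current = next_id[current]
            if current in seen:
                raise ValueError(f"cyclic ID mutation involving {current!r}")
            seen.add(current)
        resolved[start] = current
    return resolved


# Hypothetical sample: lead-1 became lead-2, which later became contact-9.
mutations = [("lead-1", "lead-2"), ("lead-2", "contact-9"), ("lead-7", "lead-8")]
final_ids = resolve_cascading_ids(mutations)
```

Here both `lead-1` and `lead-2` resolve to `contact-9`, so a single rewrite per record would suffice instead of one rewrite per hop in the chain.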

In this session watch:
Heng Zhang, Developer, Salesforce
Zhidong Ke, Software Engineer, Salesforce

Heng Zhang

I am a software engineer interested and specializing in microservices, distributed systems, and big data.

Zhidong Ke

I am passionate about designing distributed systems, real-time/batch data processing, and building applications.