Heng Zhang

Developer, Salesforce

I am a software engineer who is interested in and specializes in microservices, distributed systems, and big data.

Past sessions

Summit 2021 A High Performance Mutable Engagement Activity Delta Lake

May 26, 2021 05:00 PM PT

At Salesforce, our customers use High Velocity Sales to intelligently convert leads and create new opportunities. To support it, we built the engagement activity platform, which automatically captures and stores user engagement activities using Delta Lake. This platform is one of the key components supporting Einstein Analytics for creating powerful reports and dashboards and Sales Cloud Einstein for training machine learning models.

Converting leads and creating new opportunities requires our engagement activity Delta Lake to handle data mutations at scale. In this presentation, we will share the challenges and lessons learned from building a high-performance mutable data lake using Delta Lake, including:

  • Independent Stream Process to Support Engagement Data Lifecycle
  • Downstream Incremental Read
  • High Throughput Transactions in Engagement ID Mutation
    • Detect Cascading ID Mutation with Graph
    • Data Skipping and Z-Order with I/O Pruning
  • High Data Consistency and Integrity
    • Exactly-Once Write Across Tables
    • Global Synchronization and Ordering
In this session watch:
Heng Zhang, Developer, Salesforce
Zhidong Ke, Software Engineer, Salesforce

