Session

Master Schema Translations in the Era of Open Data Lake

Overview

Experience: In Person
Type: Lightning Talk
Track: Data and AI Governance
Industry: Professional Services, Travel and Hospitality, Financial Services
Technologies: Delta Lake, Databricks SQL, Unity Catalog
Skill Level: Intermediate
Duration: 20 min

Unity Catalog brings a variety of schemas into a centralized repository. Now the developer community wants more productivity and automation for schema inference, translation, evolution, and optimization, especially in ingestion and reverse-ETL scenarios, with more code generation.

Coinbase Data Platform attempts to pave a path with "Schemaster", which interacts with the data catalog through a (proposed) metadata model to make schema translation and evolution more manageable across some of the popular systems, such as Delta, Iceberg, Snowflake, Kafka, MongoDB, DynamoDB, and Postgres.

This Lightning Talk covers 4 areas:

  • The complexity and caveats of schema differences among 
  • The proposed field-level metadata model, and 2 translation patterns: point-to-point vs hub-and-spoke
  • Why data profiling should be used to augment schema understanding and translation
  • How to integrate Schemaster with ingestion and reverse-ETL in a Databricks-oriented ecosystem
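
To make the two translation patterns concrete, here is a minimal sketch of a hub-and-spoke translator. It is not Schemaster's actual metadata model, which this abstract does not describe: the `FieldMeta` class, the canonical type names, and the per-system mapping tables are all assumptions made for illustration, and the type mappings are deliberately simplified (they ignore precision, length, and time zone caveats).

```python
from dataclasses import dataclass

# Hypothetical field-level metadata record; the real model is not shown in the abstract.
@dataclass(frozen=True)
class FieldMeta:
    name: str
    hub_type: str          # canonical ("hub") type, an assumed vocabulary
    nullable: bool = True

# One spoke mapping per system: native type -> hub type (simplified on purpose).
TO_HUB = {
    "postgres": {"text": "string", "bigint": "int64",
                 "double precision": "float64", "boolean": "boolean"},
    "snowflake": {"VARCHAR": "string", "NUMBER(38,0)": "int64",
                  "FLOAT": "float64", "BOOLEAN": "boolean"},
    "delta": {"STRING": "string", "BIGINT": "int64",
              "DOUBLE": "float64", "BOOLEAN": "boolean"},
}

# The reverse direction (hub type -> native type) is derived by inverting each spoke.
FROM_HUB = {
    system: {hub: native for native, hub in mapping.items()}
    for system, mapping in TO_HUB.items()
}

def translate_type(native_type: str, source: str, target: str) -> str:
    """Translate a native type by going source -> hub -> target."""
    hub_type = TO_HUB[source][native_type]
    return FROM_HUB[target][hub_type]

def render_schema(fields: list[FieldMeta], target: str) -> list[tuple[str, str]]:
    """Render hub-level field metadata as (name, native type) pairs for one system."""
    return [(f.name, FROM_HUB[target][f.hub_type]) for f in fields]

def translator_counts(n_systems: int) -> dict[str, int]:
    """Directed mappings needed: every ordered pair vs. two per system via the hub."""
    return {
        "point_to_point": n_systems * (n_systems - 1),
        "hub_and_spoke": 2 * n_systems,
    }
```

For the seven systems named above, `translator_counts(7)` gives 42 directed point-to-point mappings versus 14 hub-and-spoke mappings. The trade-off is that the hub vocabulary must be expressive enough to carry each system's type caveats without losing information.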

 

Takeaway: standardize schema lineage & translation 

Session Speakers

Eric Sun

Head of Data Platform
Coinbase