Session
Master Schema Translations in the Era of Open Data Lake
Overview
Experience | In Person |
---|---|
Type | Lightning Talk |
Track | Data and AI Governance |
Industry | Professional Services, Travel and Hospitality, Financial Services |
Technologies | Delta Lake, Databricks SQL, Unity Catalog |
Skill Level | Intermediate |
Duration | 20 min |
Unity Catalog puts a variety of schemas into a centralized repository. Now the developer community wants more productivity and automation for schema inference, translation, evolution and optimization, especially for ingestion and reverse-ETL scenarios, with more code generation.
Coinbase Data Platform attempts to pave a path with "Schemaster", which interacts with the data catalog through a (proposed) metadata model to make schema translation and evolution more manageable across some of the popular systems, such as Delta, Iceberg, Snowflake, Kafka, MongoDB, DynamoDB, Postgres...
This Lightning Talk covers 4 areas:
- The complexity and caveats of schema differences among
- The proposed field-level metadata model, and 2 translation patterns: point-to-point vs hub-and-spoke
- Why Data Profiling should be augmented to enhance schema understanding and translation
- Integrate it with Ingestion & Reverse-ETL in a Databricks-oriented ecosystem
Takeaway: standardize schema lineage & translation
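The difference between the two translation patterns named above can be illustrated with a minimal sketch. This is not Schemaster's actual metadata model, which the abstract does not describe: the `Field` class, the canonical type names, the helper functions, and the small type maps are all hypothetical and deliberately incomplete. In hub-and-spoke, each system only maps to and from one canonical model, so the number of mappings grows linearly with the number of systems, while point-to-point needs a mapping for every ordered pair.

```python
from dataclasses import dataclass
from typing import Tuple


# Hypothetical canonical ("hub") field model; an assumption for illustration,
# not the metadata model proposed in the talk.
@dataclass(frozen=True)
class Field:
    name: str
    canonical_type: str  # e.g. "string", "int64", "timestamp"
    nullable: bool = True


# Illustrative, incomplete type maps: system type -> canonical type.
TO_CANONICAL = {
    "postgres": {"text": "string", "bigint": "int64", "timestamptz": "timestamp"},
    "delta": {"STRING": "string", "BIGINT": "int64", "TIMESTAMP": "timestamp"},
}

# Reverse maps: canonical type -> system type.
FROM_CANONICAL = {
    system: {canonical: native for native, canonical in types.items()}
    for system, types in TO_CANONICAL.items()
}


def to_hub(system: str, name: str, system_type: str, nullable: bool = True) -> Field:
    """Spoke -> hub: lift a system-specific field into the canonical model."""
    return Field(name, TO_CANONICAL[system][system_type], nullable)


def from_hub(system: str, field: Field) -> Tuple[str, str]:
    """Hub -> spoke: lower a canonical field into a system-specific type."""
    return field.name, FROM_CANONICAL[system][field.canonical_type]


def translate(source: str, target: str, name: str, system_type: str) -> Tuple[str, str]:
    """Translate one field between two systems by going through the hub."""
    return from_hub(target, to_hub(source, name, system_type))


def mapping_counts(n_systems: int) -> Tuple[int, int]:
    """Directed mappings needed: (point-to-point, hub-and-spoke)."""
    return n_systems * (n_systems - 1), 2 * n_systems


if __name__ == "__main__":
    print(translate("postgres", "delta", "user_id", "bigint"))
    print(mapping_counts(7))  # the seven systems named in the abstract
```

For the seven systems listed in the abstract, this counts 42 directed point-to-point mappings against 14 hub-and-spoke mappings. The trade-off is that the canonical model has to be expressive enough to carry each system's caveats, which a flat type name like the one sketched here does not attempt.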
Session Speakers
Eric Sun
Head of Data Platform
Coinbase