Apache XTable (incubating): Interoperability Among Lakehouse Table Formats
OVERVIEW
EXPERIENCE | In Person
TYPE | Breakout
TRACK | Data Lakehouse Architecture
INDUSTRY | Enterprise Technology
TECHNOLOGIES | Data Sharing, Apache Spark, Delta Lake
SKILL LEVEL | Intermediate
DURATION | 40 min
Apache Hudi, Delta Lake and Apache Iceberg have emerged as leading open source projects that decouple storage from compute by adding transaction and metadata primitives, known as table formats, on top of cloud storage. When data is written to a distributed file system, all of these formats store it in open columnar file formats like Parquet, along with metadata for schema, commit history, partitions and column statistics. Choosing a table format is challenging because each project offers unique features. Enter XTable, an OSS project that provides omnidirectional interoperability between table formats. XTable doesn't introduce a new format; instead, it provides abstractions for translating table format metadata. This enables writing data once in any format and converting it to target formats consumable by different compute engines. This session showcases how XTable solves the challenge of format selection and interoperability in lakehouse workloads.
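To make the write-once, read-anywhere idea concrete, here is a minimal PySpark sketch, not taken from the session materials. It assumes a placeholder table path, that the Delta and Iceberg Spark runtime jars are on the classpath, and that the XTable metadata sync has been run out of band (typically as a separate Java utility driven by a YAML dataset config) to generate Iceberg metadata next to the existing Parquet files.

```python
# Minimal sketch: write a table once as Delta, then (after an XTable metadata
# sync run separately) read the same data files through Iceberg metadata.
# Paths, table names, and configuration values below are illustrative assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("xtable-interop-sketch")
    # Assumes the Delta Lake and Apache Iceberg Spark runtimes are available.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

table_path = "s3://my-bucket/lakehouse/orders"  # hypothetical table location

# 1. Write the data once, here in Delta format (it could equally be Hudi or Iceberg).
orders = spark.createDataFrame(
    [(1, "widget", 9.99), (2, "gadget", 19.99)],
    ["order_id", "item", "price"],
)
orders.write.format("delta").mode("overwrite").save(table_path)

# 2. Run the XTable sync out of band to translate the Delta metadata into
#    Iceberg metadata alongside the same Parquet data files.

# 3. Any Iceberg-capable engine can now read those files through the
#    generated Iceberg metadata (path-based Hadoop table load).
iceberg_view = spark.read.format("iceberg").load(table_path)
iceberg_view.show()
```

The point of the sketch is that only metadata is translated: the Parquet data files are written once and shared across formats, which is what lets different compute engines consume the same table without copying data.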
SESSION SPEAKERS
Kyle Weller
Head of Product, Onehouse
Dipankar Mazumdar
Staff Developer Advocate, Onehouse