Junping Du is chief architect of the Tencent Cloud Big Data Department, where he is responsible for the cloud data warehouse engineering team. As an Apache Hadoop committer and PMC member, he serves as release manager of Hadoop 2.6.x and 2.8.x for the community. Junping has more than 10 years of industry experience in big data and cloud computing. Before joining Tencent, he was YARN team lead at Hortonworks. Prior to Hortonworks, he worked as tech lead for vHadoop and Big Data Extension at VMware.
Recently, a set of modern table formats such as Delta Lake, Hudi, and Iceberg has sprung up. Along with Hive Metastore, these table formats try to solve long-standing problems in traditional data lakes with their declared features such as ACID, schema evolution, upsert, time travel, and incremental consumption. This talk shares our research comparing the key features and designs of these table formats and the maturity of those features, such as the APIs exposed to end users and how each format works with compute engines. Finally, a comprehensive benchmark covering transactions, upserts, and massive numbers of partitions will be shared as a reference for the audience.