Has worked in eBay’s Data Platform org for about 6 years since graduating. Has experience with both streaming and batch compute systems. Now focuses mainly on performance optimization of SparkSQL.
November 18, 2020 04:00 PM PT
This talk covers the problems we met after enabling dynamic partition pruning (DPP) in production for many customers' interactive SQL queries, and how we enhanced DPP to solve them. It also covers the extra work we did to give DPP a better filter ratio in our cases, since scanning large amounts of data is becoming the major bottleneck of our loaded cluster serving interactive queries. Finally, it presents our design and work on runtime filters, and how we benefit from all of the above.
Speaker: Xiaoju Wu