Apache Spark on K8S Best Practice and Performance in the Cloud

Download Slides

Kubernetes As of Spark 2.3, Spark can run on clusters managed by Kubernetes. we will describes the best practices about running Spark SQL on Kubernetes upon Tencent cloud includes how to deploy Kubernetes against public cloud platform to maximum resource utilization and how to tune configurations of Spark to take advantage of Kubernetes resource manager to achieve best performance. To evaluate performance, the TPC-DS benchmarking tool will be used to analysis performance impact of queries between configurations set.

 

Essayer Databricks
See More Spark + AI Summit in San Francisco 2019 Videos


« back
About Junjie Chen

Senior Software Engineer at Tencent. Focus on big data area years, PPMC of TubeMQ, contributor of Hadoop, Spark, Hive, and Parquet.

About Junping Du

Junping Du is chief architect for Tencent Cloud Big Data Department and responsible for cloud data warehouse engineering team. As Apache Hadoop Committer/PMC member, he serves as release manager of Hadoop 2.6.x and 2.8.x for community. Junping has more than 10 years industry experiences in big data and cloud area. Before joining Tencent, he was YARN team lead at Hortonworks. Prior to Hortonworks, he worked as tech lead for vHadoop and Big Data Extension at VMware.