Kris Mok

Software Engineer, Databricks

Kris Mok is a software engineer at Databricks. He works on various components of Spark SQL, with a particular interest in the optimizer and code generation. Previously, he worked on JVM implementations, including the OpenJDK HotSpot VM at Alibaba and Oracle and the Zing VM at Azul, and he has a broad interest in programming language design and implementation.

Past sessions

Summit 2019 A Deep Dive into Query Execution Engine of Spark SQL

April 24, 2019 05:00 PM PT

Spark SQL enables Spark to perform efficient and fault-tolerant relational query processing using analytics database technologies. Relational queries are compiled into executable physical plans consisting of transformations and actions on RDDs with generated Java code. That code is compiled to Java bytecode, executed by the JVM, and optimized by the JIT compiler into native machine code at runtime. This talk takes a deep dive into the Spark SQL execution engine, covering pipelined execution, whole-stage code generation, UDF execution, memory management, vectorized readers, and lineage-based RDD transformations and actions.