This talk is about investigating Spark SQL performance. Starting from a performance troubleshooting case from CERN database services, the discussion moves on to important performance-oriented optimizations introduced in Apache Spark 2.0, in particular whole-stage code generation. The performance improvements in Spark 2.0 are illustrated with diagnostic tools, including SQL execution plans, Linux perf stat, and flame graphs. Flame graphs for Spark are discussed in detail. Flame graph visualizations of stack profiles provide additional insight into which parts of the code are executed and where CPU cycles are consumed, which is of great help for performance troubleshooting and root cause analysis.
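
To make the idea of a stack profile concrete, here is a minimal sketch of how sampled stacks can be collapsed into the semicolon-separated "folded" text format that flame graph tooling commonly takes as input. It is an illustration only, not the tooling used in the talk: the function names `fold_stacks` and `leaf_counts` and all frame names in `SAMPLES` are hypothetical, and a real profile would come from a sampling profiler rather than a hard-coded list.

```python
from collections import Counter

def fold_stacks(samples):
    """Collapse stack samples into 'root;...;leaf count' lines.

    Each sample is a list of frame names, root frame first.
    Identical stacks are merged and their occurrences counted.
    """
    counts = Counter(";".join(stack) for stack in samples if stack)
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]

def leaf_counts(samples):
    """Count how often each frame was on top of the stack.

    The top (leaf) frame is where the CPU was executing when the
    sample was taken, so this approximates per-function CPU cost.
    """
    return Counter(stack[-1] for stack in samples if stack)

# Hypothetical samples; every frame name here is invented for illustration.
SAMPLES = [
    ["main", "run_query", "aggregate"],
    ["main", "run_query", "aggregate"],
    ["main", "run_query", "scan"],
    ["main", "gc"],
]

if __name__ == "__main__":
    for line in fold_stacks(SAMPLES):
        print(line)
    print(leaf_counts(SAMPLES).most_common(1))
```

In a flame graph built from such folded lines, the width of each box is proportional to the number of samples in which that frame appears, which is why wide boxes near the top of the graph point to where CPU cycles are spent.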