Pivoting Data with SparkSQL - Databricks

Pivot tables are an essential part of data analysis and reporting. A pivot can be thought of as translating rows into columns while applying one or more aggregations. Many popular data manipulation tools (pandas, reshape2, and Excel) and databases (MS SQL and Oracle 11g) include the ability to pivot data. With the release of Spark 1.6, pivot is now part of the DataFrame API. We discuss how to use it and go over real-world examples.
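
To make the "rows into columns plus an aggregation" idea concrete, here is a minimal plain-Python sketch of what a pivot computes. It is not Spark code and not taken from the talk: the sample rows, the column meanings (year, quarter, revenue), and the helper name pivot_sum are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (year, quarter, revenue)
rows = [
    (2015, "Q1", 100),
    (2015, "Q2", 150),
    (2015, "Q1", 50),
    (2016, "Q1", 200),
    (2016, "Q2", 250),
]

def pivot_sum(rows):
    """Group by year, turn each distinct quarter into a column, sum revenue."""
    # The distinct values of the pivot column become the output columns.
    columns = sorted({quarter for _, quarter, _ in rows})
    totals = defaultdict(lambda: defaultdict(int))
    for year, quarter, revenue in rows:
        totals[year][quarter] += revenue
    # None marks a (year, quarter) combination that had no input rows.
    return {
        year: {q: by_quarter.get(q) for q in columns}
        for year, by_quarter in sorted(totals.items())
    }

pivoted = pivot_sum(rows)
for year, cells in pivoted.items():
    print(year, cells)
```

In the Spark DataFrame API, the corresponding call chain would be along the lines of df.groupBy("year").pivot("quarter").sum("revenue"), assuming a DataFrame df with those column names.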

Learn more:

  • Reshaping Data with Pivot in Apache Spark
  • Five Spark SQL Utility Functions to Extract and Explore Complex Data Types