If you have a parent-child or many-to-many relationship in your data model, you will want to learn about nested dataset functionality in Spark. Ted Malaska (co-author of Hadoop Application Architectures) will walk through why nested types may change your life when solving common problems like large joins and even Cartesian joins. This talk will include a full code example of creating nested tables with Spark SQL, populating those tables, and finally accessing them in a number of ways.
Ted has seen the world of data from many angles, from helping hundreds of different companies while serving as a Principal Solutions Architect at Cloudera to spending multiple years at the leading game company Blizzard, building out data pipelines and managing data engineering efforts. Ted now serves as a Director of Enterprise Architecture at Capital One, solving data problems at every level of the company.