An introduction to JSON support in Spark SQL

Note: Starting Spark 1.3, SchemaRDD will be renamed to DataFrame.

In this blog post, we introduce Spark SQL’s JSON support, a feature we have been working on at Databricks to make it dramatically easier to query and create JSON data in Spark. With the prevalence of web and mobile applications, JSON has become the de-facto interchange format for web service API’s as well as long-term storage. With existing tools, users often engineer complex pipelines to read and write JSON data sets within analytical systems. Spark SQL’s JSON support, released in Apache Spark 1.1 and enhanced in Apache Spark 1.2, vastly simplifies the end-to-end-experience of working with JSON data.

Sign up