Scalable Partition Handling for Cloud-Native Architecture in Apache Spark 2.1
Apache Spark 2.1 is just around the corner: the community is going through the voting process for the release candidates. This blog post discusses one of the most important features in the upcoming release: scalable partition handling. Spark SQL lets you query terabytes of data with a single job. Often, though, users only want to...