Apache Spark the fastest open source engine for sorting a petabyte - The Databricks Blog