Today, we are happy to announce Apache Spark Packages (http://spark-packages.org), a community package index to track the growing number of open source packages and libraries that work with Apache Spark. Spark Packages makes it easy for users to find, discuss, rate, and install packages for any version of Spark, and makes it easy for developers to contribute packages.
Our friends at Twitter have contributed to MLlib, and this post uses material from Twitter’s description of its open-source contribution, with permission...
This is a guest post by Nick Pentreath of Graphflow and Kan Zhang of IBM, who contributed Python input/output format support to Apache Spark 1.1. Two powerful features of Apache Spark are its native APIs in Scala, Java, and Python, and its compatibility with any Hadoop-based input or output source. This language support means that users can quickly become proficient in Spark even without experience in Scala, and furthermore can leverag