InfoWorld recently unveiled its 2015 Technology of the Year Award winners, which range from open source software to stellar consumer technologies like the iPhone. As the company founded by the creators of Apache Spark, Databricks is thrilled to see Spark in their ranks. In fact, we built our flagship product, Databricks, on top of Spark with the ambition to revolutionize big data processing in much the same way the iPhone revolutionized the mobile experience.
The iPhone was revolutionary in a number of ways: first, it integrated a disparate set of consumer electronics capabilities, such as a mobile phone, camera, GPS, and even a laptop; second, it created a seamless experience for navigating these capabilities with iOS; lastly, it was easily extensible through third-party applications, which gave rise to a whole ecosystem of products built around the iPhone. In short, the iPhone is a simple and powerful platform that meets all the mobile needs of its users.
Databricks is the analogous product in the big data world.
First, Databricks unifies disparate functionalities: it runs 100% open source Spark, a lightning-fast, general-purpose big data processing engine that includes a broad set of standard libraries. This means users get performance up to 100x faster than Hadoop MapReduce in memory, or up to 10x faster on disk, along with capabilities such as SQL, machine learning, and graph processing out of the box. Users can get all of their big data processing tools through Spark instead of integrating disparate tools.
Second, Databricks also includes features that make deploying and working with Spark much simpler. The cluster manager allows users to create, modify, and tear down Spark clusters in their Amazon Virtual Private Cloud (VPC) in a few clicks. The interactive workspace allows users to easily visualize data and share results with colleagues. The job scheduler allows production data processing pipelines to be built in a matter of minutes. Through Databricks, users can have a seamless experience working with big data, whether it’s interactive exploration or batch processing.
Lastly, Databricks provides common connectors that allow third-party applications, such as Tableau and many of the common BI tools used by corporations today, to integrate seamlessly. This gives non-technical users access to big data and further increases the usability of Databricks.
We built Databricks to make big data simple: easy to use, lightning fast, and able to deliver instant results for all big data processing needs. As Spark gains momentum, many are finding that Databricks is the best way to run Spark thanks to its rapid deployment, seamless user experience, and performance. If you would like to try out Databricks, simply sign up for a trial!