We are excited to announce the General Availability (GA) of Databricks Community Edition (DCE). As a free version of the Databricks service, DCE enables everyone to learn and explore Apache Spark, by providing access to a simple, integrated development environment for data analysts, data scientists and engineers with high quality training materials and sample application notebooks.
Less than four months ago, at Spark Summit New York, we introduced the Databricks Community Edition (DCE) beta. Its introduction generated tremendous interest, with thousands of people requesting accounts. Today, we are delighted to report that more than 8,000 users have signed up for DCE, many of them using the service heavily. The top 10% of active users are averaging over 6 hours per week, and are executing over 10,000 commands on average.
Going beyond these numbers, we are delighted to see DCE attracting a wide user base. According to a survey we conducted recently, 25% of our users had never used Spark before, and 60% of the users are neither data scientists nor data engineers. This demonstrates the effectiveness of DCE in growing the open source Apache Spark user community by bringing new users into the fold, as well as its ability to train new data scientists and engineers.
The same survey also indicates that 90% of the users are using DCE to learn Apache Spark, which establishes the role of DCE as a learning platform for Spark. Indeed, since its launch, dozens of universities have already used DCE for teaching, including UC Berkeley and Stanford. At UC Berkeley, over 900 students used DCE to learn Apache Spark during the “Structure and Interpretation of Computer Programs” class in Spring 2016.
To further aid the efforts of teaching big data with Apache Spark and to reach students worldwide, we are happy to announce that this year Databricks will offer a free five-MOOC series (up from the two MOOCs Databricks offered last year) on edX, all of which will be taught on DCE:
- Introduction to Apache Spark
- Distributed Machine Learning with Apache Spark
- Big Data Analysis with Apache Spark
- Advanced Apache Spark for Data Science and Engineering
- Advanced Machine Learning with Apache Spark
The GA also comes with new introductory materials and sample applications. In particular, to make learning Apache Spark even easier, we have added three notebooks that provide a “gentler” introduction to Apache Spark. You can find these new notebooks here:
- A Gentle Introduction to Apache Spark on Databricks
- Apache Spark on Databricks for Data Engineers
- Apache Spark on Databricks for Data Scientists
By making DCE generally available, we are looking to fuel the growth of the community by introducing Apache Spark to first-time users. And by training a new generation of data scientists and engineers, we hope to mitigate the ever-growing scarcity of data specialists.