
We are excited to announce the General Availability (GA) of Databricks Community Edition (DCE). As a free version of the Databricks service, DCE enables everyone to learn and explore Apache Spark by providing access to a simple, integrated development environment for data analysts, data scientists and engineers, along with high-quality training materials and sample application notebooks.

Less than four months ago, at Data + AI Summit, we introduced the Databricks Community Edition (DCE) beta. Its introduction generated tremendous interest, with thousands of people requesting accounts. Today, we are delighted to report that more than 8,000 users have signed up for DCE, many of them using the service heavily. The top 10% of active users are averaging over 6 hours per week and executing over 10,000 commands on average.

Going beyond these numbers, we are delighted to see DCE attracting a wide user base. According to a survey we conducted recently, 25% of our users had never used Spark before, and 60% of the users are neither data scientists nor data engineers. This demonstrates DCE's effectiveness at growing the open source Apache Spark user community by bringing new users into the fold, as well as its ability to train new data scientists and engineers.

The same survey also indicates that 90% of the users are using DCE to learn Apache Spark, which establishes DCE's role as a learning platform for Spark. Indeed, since its launch, tens of universities have already used DCE for teaching, including UC Berkeley and Stanford. At UC Berkeley, over 900 students used DCE to learn Apache Spark during the “Structure and Interpretation of Computer Programs” class in Spring 2016.

To further aid the teaching of big data with Apache Spark and to reach students worldwide, we are happy to announce that this year Databricks will offer a free series of five MOOCs on edX (up from the two MOOCs Databricks offered last year), all of which will be taught on DCE:

  • Introduction to Apache Spark
  • Distributed Machine Learning with Apache Spark
  • Big Data Analysis with Apache Spark
  • Advanced Apache Spark for Data Science and Engineering
  • Advanced Machine Learning with Apache Spark

Also, the GA comes with new introductory materials and sample applications. In particular, to make learning Apache Spark even easier, we have added three notebooks to provide a “gentler” introduction to Apache Spark. You can find these new notebooks here:

By making DCE generally available, we are looking to fuel the growth of the community by introducing Apache Spark to first-time users. And by training a new generation of data scientists and engineers, we hope to mitigate the ever-growing scarcity of data specialists.

Sign up for Databricks Community Edition
