Evolving the Databricks brand
Some brands start out as, well, brands. A lot of work goes into the concept and painting the picture before the business is ever launched. Databricks is different. It always has been and always will be an engineering-led company. Databricks’ model for innovation is inspired by the open-source community. This is where our roots run...
Introducing Glow: An Open-Source Toolkit for Large-Scale Genomic Analysis
The key to solving some of today’s most challenging medical problems lies in the analysis of genomics data. Understanding the impact of the minor changes in an individual’s genome on their overall health is fundamentally a data-driven challenge that requires integration across hundreds of thousands of individuals. By analyzing genomes across large cohorts, researchers...
Accelerating Discovery with a Unified Analytics Platform for Genomics
Today we are proud to introduce the Databricks Unified Analytics Platform for Genomics. With a unified platform for genomic data processing, tertiary analytics, and machine learning at massive scale, healthcare and life sciences organizations can accelerate the discovery of life-changing treatments and further advancements in personalized and preventative care. The Genomic Data Explosion The...
Databricks Community Edition is now Generally Available
We are excited to announce the General Availability (GA) of Databricks Community Edition (DCE). As a free version of the Databricks service, DCE enables everyone to learn and explore Apache Spark by providing access to a simple, integrated development environment for data analysts, data scientists and engineers with high-quality training materials and sample application...
New Content in Databricks Community Edition
At the Spark Summit New York, we announced Databricks Community Edition (CE) beta. CE is a free version of the Databricks service that allows everyone to learn and explore Apache Spark by providing a simple, integrated development environment for data scientists and engineers with high-quality training materials and sample applications. The community interest in...
Introducing Databricks Community Edition: Apache Spark for All
As developers at heart, we at Databricks are committed to the development of Apache Spark and the continued growth of the community. Today we took another step towards delivering on that goal with the beta release of Databricks Community Edition, a free version of our cloud-based Spark platform. You can read the press release here,...
Databricks is now Generally Available
We are excited to announce today, at Spark Summit 2015, the general availability of Databricks, a hosted data platform from the team that created Apache Spark. With Databricks, you can effortlessly launch Spark clusters, explore data interactively, run production jobs, and connect third-party applications. We believe Databricks is the easiest way to use...
Analyzing Apache Access Logs with Databricks
Databricks provides a powerful platform to process, analyze, and visualize big and small data in one place. In this blog, we will illustrate how to analyze access logs of an Apache HTTP web server using Notebooks. Notebooks allow users to write and run arbitrary Apache Spark code and interactively visualize the results. Currently, notebooks support three...
The Easiest Way to Run Apache Spark Jobs
Recently, Databricks added a new feature, Jobs, to our cloud service. You can find a detailed overview of this feature here. This feature lets you programmatically run Apache Spark jobs on Amazon’s EC2 more easily than ever before. In this blog, I will provide a quick tour of this feature. What is a Job? The...
Databricks: Making Big Data Easy
Our vision at Databricks is to make big data easy so that we enable every organization to turn its data into value. At Spark Summit 2014, we were very excited to unveil Databricks, our first product towards fulfilling this vision. In this post, I’ll briefly go over the challenges that data scientists and data engineers...