Engineering blog

Shiny and Environments for R Notebooks

At Databricks, we want the Lakehouse ecosystem widely accessible to all data practitioners, and R is a great interface language for this purpose...
Engineering blog

Introducing the Databricks Web Terminal

We're excited to introduce the public preview of the Databricks Web Terminal in the 3.25 platform release. Any user with "Can Attach...
Engineering blog

%tensorboard - a new way to use TensorBoard on Databricks

August 25, 2020 by Jerry Liang and Hossein Falaki in Engineering Blog
With the Databricks Runtime 7.2 release, we are introducing a new magic command, %tensorboard. This brings the interactive TensorBoard experience...
Engineering blog

Developing Shiny Applications in Databricks

March 9, 2020 by Yifan Cao and Hossein Falaki in Engineering Blog
Join our live webinar hosted by Data Science Central on March 12 to learn more. We are excited to announce that you can...
Engineering blog

Introducing Databricks Runtime 5.4 with Conda (Beta)

We are excited to introduce a new runtime: Databricks Runtime 5.4 with Conda (Beta). This runtime uses Conda to manage Python libraries and...
Engineering blog

Accelerating Machine Learning on Databricks: On-Demand Webinar and FAQ Now Available!

February 4, 2019 by Hossein Falaki and Adam Conway in Engineering Blog
Try this notebook in Databricks. On January 15th, we hosted a live webinar, Accelerating Machine Learning on Databricks, with Adam Conway, VP of...
Engineering blog

Introducing Databricks Runtime 5.1 for Machine Learning

Last week, we released Databricks Runtime 5.1 Beta for Machine Learning. As part of our commitment to provide developers with the latest deep...
Engineering blog

Introducing Databricks Runtime 5.0 for Machine Learning

Six months ago we introduced the Databricks Runtime for Machine Learning with the goal of making machine learning performant and easy on the...
Engineering blog

100x Faster Bridge between Apache Spark and R with User-Defined Functions on Databricks

August 15, 2018 by Liang Zhang and Hossein Falaki in Engineering Blog
SparkR User-Defined Function (UDF) API opens up opportunities for big data workloads running on Apache Spark to embrace R's rich package ecosystem. Some...
Company blog

Sharing R Notebooks using RMarkdown

July 6, 2018 by Hanyu Cui and Hossein Falaki in Company Blog
At Databricks, we are thrilled to announce the integration of RStudio with the Databricks Unified Analytics Platform. You can try it out now...
Platform blog

Announcing RStudio and Databricks Integration

At Databricks, we are thrilled to announce the integration of RStudio with the Databricks Unified Analytics Platform. You can try it out now...
Engineering blog

Accelerating R Workflows on Databricks

October 6, 2017 by Hossein Falaki in Engineering Blog
At Databricks we strive to make our Unified Analytics Platform the best place to run big data analytics. For big data, Apache Spark...
Engineering blog

On-Demand Webinar and FAQ: Parallelize R Code Using Apache Spark

August 21, 2017 by Hossein Falaki and Jules Damji in Engineering Blog
On August 15th, Data Science Central hosted a live webinar, Parallelize R Code Using Apache Spark, with Databricks’ Hossein Falaki. This webinar introduced SparkR...
Company blog

Shell Oil Use Case: Parallelizing Large Simulations with Apache SparkR on Databricks

This blog post is a joint engineering effort between Shell’s Data Science Team (Wayne W. Jones and Dennis Vallinga) and Databricks...
Engineering blog

Using sparklyr in Databricks

May 25, 2017 by Hossein Falaki in Engineering Blog
Try this notebook on Databricks, with all instructions as explained in this post. In September 2016, RStudio announced sparklyr, a new...
Engineering blog

SparkR Tutorial at useR 2016

AMPLab and Databricks gave a tutorial on SparkR at the useR conference. The conference was held from June 27 - June 30 at...
Engineering blog

Approximate Algorithms in Apache Spark: HyperLogLog and Quantiles

Apache Spark is fast, but applications such as preliminary data exploration need to be even faster and are willing to sacrifice some...
Company blog

Introducing R Notebooks in Databricks

July 13, 2015 by Hossein Falaki in Company Blog
Apache Spark 1.4 was released on June 11 and one of the exciting new features was SparkR. I am happy to announce...
Engineering blog

Statistics Functionality in Apache Spark 1.1

One of our philosophies in Apache Spark is to provide rich and friendly built-in libraries so that users can easily assemble data pipelines. With Spark, and MLlib in particular, quickly gaining traction among data scientists and machine learning practitioners, we’re observing a growing demand for data analysis support outside of model fitting. To address this need, we have started to add scalable implementations of common statistical functions to facilitate v