Articles by Hossein Falaki - Databricks Blog

Shiny and Environments for R Notebooks

September 27, 2021 by Jiho Lee, Marco Ximenes Rego Monteiro, Deka Auliya Akbar and Hossein Falaki in Engineering Blog

At Databricks, we want the Lakehouse ecosystem widely accessible to all data practitioners, and R is a great interface language for this purpose...

Introducing the Databricks Web Terminal

August 31, 2020 by Hossein Falaki and Kasey Uhlenhuth in Engineering Blog

Introduction: We're excited to introduce the public preview of the Databricks Web Terminal in the 3.25 platform release. Any user with "Can Attach...

%tensorboard - a new way to use TensorBoard on Databricks

August 25, 2020 by Jerry Liang and Hossein Falaki in Engineering Blog

Introduction: With the Databricks Runtime 7.2 release, we are introducing a new magic command, %tensorboard. This brings the interactive TensorBoard experience...

Developing Shiny Applications in Databricks

March 9, 2020 by Yifan Cao and Hossein Falaki in Engineering Blog

Join our live webinar hosted by Data Science Central on March 12 to learn more. We are excited to announce that you can...

Introducing Databricks Runtime 5.4 with Conda (Beta)

June 4, 2019 by Hossein Falaki and Yifan Cao in Engineering Blog

We are excited to introduce a new runtime: Databricks Runtime 5.4 with Conda (Beta). This runtime uses Conda to manage Python libraries and...

Accelerating Machine Learning on Databricks: On-Demand Webinar and FAQ Now Available!

February 4, 2019 by Hossein Falaki and Adam Conway in Engineering Blog

On January 15th, we hosted a live webinar, Accelerating Machine Learning on Databricks, with Adam Conway, VP of...

Introducing Databricks Runtime 5.1 for Machine Learning

January 7, 2019 by Hossein Falaki, Hanyu Cui and Andy Zhang in Engineering Blog

Last week, we released Databricks Runtime 5.1 Beta for Machine Learning. As part of our commitment to provide developers with the latest deep...

Introducing Databricks Runtime 5.0 for Machine Learning

November 27, 2018 by Andy Zhang, Hanyu Cui and Hossein Falaki in Engineering Blog

Six months ago we introduced the Databricks Runtime for Machine Learning with the goal of making machine learning performant and easy on the...

100x Faster Bridge between Apache Spark and R with User-Defined Functions on Databricks

August 15, 2018 by Liang Zhang and Hossein Falaki in Engineering Blog

SparkR User-Defined Function (UDF) API opens up opportunities for big data workloads running on Apache Spark to embrace R's rich package ecosystem. Some...

Sharing R Notebooks using RMarkdown

July 6, 2018 by Hanyu Cui and Hossein Falaki in Company Blog

At Databricks, we are thrilled to announce the integration of RStudio with the Databricks Unified Analytics Platform. You can try it out now...

Announcing RStudio and Databricks Integration

June 27, 2018 by Brian Dirking, Hossein Falaki and Denny Lee in Partners

At Databricks, we are thrilled to announce the integration of RStudio with the Databricks Unified Analytics Platform. You can try it out now...

Accelerating R Workflows on Databricks

October 6, 2017 by Hossein Falaki in Engineering Blog

At Databricks we strive to make our Unified Analytics Platform the best place to run big data analytics. For big data, Apache Spark...

On-Demand Webinar and FAQ: Parallelize R Code Using Apache Spark

August 21, 2017 by Hossein Falaki and Jules Damji in Engineering Blog

On August 15th, Data Science Central hosted a live webinar, Parallelize R Code Using Apache Spark, with Databricks’ Hossein Falaki. This webinar introduced SparkR...

Shell Oil Use Case: Parallelizing Large Simulations with Apache SparkR on Databricks

June 23, 2017 by Wayne W. Jones, Dennis Vallinga and Hossein Falaki in Company Blog

This blog post is a joint engineering effort between Shell’s Data Science Team (Wayne W. Jones and Dennis Vallinga) and Databricks...

Using sparklyr in Databricks

May 25, 2017 by Hossein Falaki in Engineering Blog

In September 2016, RStudio announced sparklyr, a new...

SparkR Tutorial at useR 2016

July 7, 2016 by Hossein Falaki and Shivaram Venkataraman in Engineering Blog

AMPLab and Databricks gave a tutorial on SparkR at the useR conference. The conference was held from June 27 - June 30 at...

Approximate Algorithms in Apache Spark: HyperLogLog and Quantiles

May 19, 2016 by Tim Hunter, Hossein Falaki and Joseph Bradley in Engineering Blog

Introduction: Apache Spark is fast, but applications such as preliminary data exploration need to be even faster and are willing to sacrifice some...

Introducing R Notebooks in Databricks

July 13, 2015 by Hossein Falaki in Company Blog

Apache Spark 1.4 was released on June 11 and one of the exciting new features was SparkR. I am happy to announce...

Statistics Functionality in Apache Spark 1.1

August 27, 2014 by Doris Xin, Burak Yavuz and Hossein Falaki in Engineering Blog

One of our philosophies in Apache Spark is to provide rich and friendly built-in libraries so that users can easily assemble data pipelines. With Spark, and MLlib in particular, quickly gaining traction among data scientists and machine learning practitioners, we’re observing a growing demand for data analysis support outside of model fitting. To address this need, we have started to add scalable implementations of common statistical functions to facilitate v