Articles by Hossein Falaki - Databricks Blog

Announcing RStudio and Databricks Integration

June 26, 2018 by Brian Dirking, Hossein Falaki and Denny Lee in Partners

At Databricks, we are thrilled to announce the integration of RStudio with the Databricks Unified Analytics Platform. You can try it out now...

Accelerating R Workflows on Databricks

October 6, 2017 by Hossein Falaki in Engineering

At Databricks we strive to make our Unified Analytics Platform the best place to run big data analytics. For big data, Apache Spark...

On-Demand Webinar and FAQ: Parallelize R Code Using Apache Spark

August 21, 2017 by Hossein Falaki and Jules Damji in Engineering

On August 15th, Data Science Central hosted a live webinar, Parallelize R Code Using Apache Spark, with Databricks’ Hossein Falaki. This webinar introduced SparkR...

Shell Oil Use Case: Parallelizing Large Simulations with Apache SparkR on Databricks

June 23, 2017 by Wayne W. Jones, Dennis Vallinga and Hossein Falaki in Product

This blog post is a joint engineering effort between Shell’s Data Science Team (Wayne W. Jones and Dennis Vallinga) and Databricks...

Using sparklyr in Databricks

May 25, 2017 by Hossein Falaki in Engineering

Try this notebook on Databricks with all instructions as explained in this post. In September 2016, RStudio announced sparklyr, a new...

SparkR Tutorial at useR 2016

July 7, 2016 by Hossein Falaki and Shivaram Venkataraman in Solutions

AMPLab and Databricks gave a tutorial on SparkR at the useR conference. The conference was held from June 27 - June 30 at...

Approximate Algorithms in Apache Spark: HyperLogLog and Quantiles

May 19, 2016 by Tim Hunter, Hossein Falaki and Joseph Bradley in Solutions

Apache Spark is fast, but applications such as preliminary data exploration need to be even faster and are willing to sacrifice some...

Introducing R Notebooks in Databricks

July 13, 2015 by Hossein Falaki in Product

Apache Spark 1.4 was released on June 11 and one of the exciting new features was SparkR. I am happy to announce...

Statistics Functionality in Apache Spark 1.1

August 27, 2014 by Doris Xin, Burak Yavuz and Hossein Falaki in Engineering

One of our philosophies in Apache Spark is to provide rich and friendly built-in libraries so that users can easily assemble data pipelines. With Spark, and MLlib in particular, quickly gaining traction among data scientists and machine learning practitioners, we’re observing a growing demand for data analysis support outside of model fitting. To address this need, we have started to add scalable implementations of common statistical functions to facilitate v