Benchmarking Structured Streaming on Databricks Runtime Against State-of-the-Art Streaming Systems
October 11, 2017 by Burak Yavuz in Engineering Blog
Update Dec 14, 2017: As a result of a fix in the toolkit’s data generator, Apache Flink's performance on a cluster of...

Accelerating R Workflows on Databricks
October 6, 2017 by Hossein Falaki in Engineering Blog
At Databricks we strive to make our Unified Analytics Platform the best place to run big data analytics. For big data, Apache Spark...

Building Complex Data Pipelines with Unified Analytics Platform
October 5, 2017 by Jules Damji and Jason Pohl in Platform Blog
Introduction: Big data practitioners often post recurring questions on Quora: What is data engineering? How to become a data scientist? What’s a data...

Bay Area Apache Spark Meetup at HPE/Aruba Networks Summary
September 22, 2017 by Jules Damji in Company Blog
On September 7th, we held our monthly Bay Area Apache Spark Meetup (BASM) at HPE/Aruba Networks in Santa Clara. We had two Apache...

Learn about Apache Spark’s Memory Model and Spark’s State in the Cloud
September 19, 2017 by Wenchen Fan and Nicolas Poggi in Company Blog
Since Apache Spark 1.6, as part of the Project Tungsten, we started an ongoing effort to substantially improve the memory and CPU...

Databricks invites Colleen Lewis to Speak about Diversity in the Workplace
September 15, 2017 by Angelos Mikelatos in Announcements
First I'll start with the sad truth. The technology industry at large has taken many hits over the years for discriminatory practices and...

Looker and Databricks Partner to Bring Data Scientists and Business Users Together
September 14, 2017 by Brian Dirking in Company Blog
We are very excited today as we announce a partnership between Databricks and Looker. We have seen customers using these products together to...

Learn about Apache Spark APIs and Best Practices
September 12, 2017 by Jules Damji and Silvio Fiorito in Company Blog
Since Apache Spark 1.3, Spark and its APIs have evolved to make them easier, faster, and smarter. The goal has been to unify...

Build, Scale, and Deploy Deep Learning Pipelines with Ease
September 6, 2017 by Sue Ann Hong and Tim Hunter in Announcements
At the Spark Summit in San Francisco in June, we announced an open-source project, Deep Learning Pipelines. Deep Learning Pipelines provides...

A Summer of Personal and Professional Growth at Databricks
September 5, 2017 by Karen Feng in Company Blog
This summer, I worked at Databricks as a software engineering intern on the Growth team. By introducing two new features, user groups and...