
Do your Streaming ETL at Scale with Apache Spark’s Structured Streaming

September 1, 2017 by Tathagata Das
At the Spark Summit in San Francisco in June, we announced that Apache Spark’s Structured Streaming is marked as production-ready and shared...

Cost Based Optimizer in Apache Spark 2.2

This is a joint engineering effort between Databricks’ Apache Spark engineering team (Sameer Agarwal and Wenchen Fan) and Huawei’s engineering team (Ron Hu...

Developing Custom Machine Learning Algorithms in PySpark

August 30, 2017 by Ajay Saini and Joseph Bradley
Developing custom Machine Learning (ML) algorithms in PySpark—the Python API for Apache Spark—can be challenging and laborious. In this blog post, we describe...

Bay Area Apache Spark Meetup at Pinterest Summary

August 28, 2017 by Jules Damji
On August 22, we held our monthly Bay Area Apache Spark Meetup (BASM) at Pinterest in San Francisco. In all, we had three...

Anthology of Technical Assets on Apache Spark's Structured Streaming

August 24, 2017 by Jules Damji
Older anthologies collated a collection of contributions from various authors around a theme—bounded then as a journal or periodical. Newer anthologies, however, include...

Best Practices for Coarse Grained Data Security in Databricks

August 23, 2017 by Bill Chambers and Jules Damji
At Databricks, we work with hundreds of companies, all pushing the bleeding edge in their respective industries. We want to share patterns for...

Getting Around “Moore’s Wall”: Databricks CEO Ali Ghodsi Strives to Make AI More Accessible to the Fortune 2000

August 22, 2017 by Battery Ventures
Today Databricks, a high-profile provider of technology fueling artificial-intelligence and data-analysis breakthroughs at big companies, announced it has raised $140 million from a...

On-Demand Webinar and FAQ: Parallelize R Code Using Apache Spark

August 21, 2017 by Hossein Falaki and Jules Damji
On August 15th, Data Science Central hosted a live webinar, Parallelize R Code Using Apache Spark, with Databricks’ Hossein Falaki. This webinar introduced SparkR...

Apache Spark’s Structured Streaming with Amazon Kinesis on Databricks

August 9, 2017 by Jules Damji
On July 11, 2017, we announced the general availability of Apache Spark 2.2.0 as part of Databricks Runtime 3.0 (DBR) for the Unified...

Databricks Named as a Strong Performer in The Forrester Wave: Insight Platforms-as-a-Service, Q3 2017

August 8, 2017 by Bharath Gowda
Forrester recently published The Forrester Wave: Insight Platforms-as-a-Service Wave, Q3 2017. In its 36-criteria evaluation of insight platform-as-a-service (PaaS) providers, Forrester identified...