Company | Databricks Blog

Databricks Serverless: Next Generation Resource Management for Apache Spark

June 7, 2017 by Greg Owen, Eric Liang, Prakash Chockalingam and Srinath Shankar in Product

As the amount of data in an organization grows, more and more engineers, analysts and data scientists need to analyze this data using...

Sharing Knowledge with the Community in a Preview of Apache Spark: The Definitive Guide

June 5, 2017 by Bill Chambers and Matei Zaharia in Announcements

Apache Spark has seen immense growth over the past several years. The size and scale of this Spark Summit is a true reflection...

Integrating Apache Spark with Cucumber for Behavioral-Driven Development

June 2, 2017 by Aaron Colcord and Zachary Nanfelt in Company

This is a guest blog from FIS Global. One of the most difficult scenarios in data processing is ensuring that the data is...

It's Almost Time for Spark Summit 2017 in San Francisco

June 1, 2017 by Scott Walent in Company

Get ready! In less than two weeks, thousands of developers, data scientists, analysts, researchers and business executives from around the world will gather...

Apache Spark Cluster Monitoring with Databricks and Datadog

June 1, 2017 by Caryl Yuhas and Ilan Rabinovitch in Company

This blog post is a joint effort between Caryl Yuhas, Databricks’ Solutions Architect, and Ilan Rabinovitch, Datadog’s Director of Technical Community and Evangelism...

Top 5 Reasons for Choosing S3 over HDFS

May 31, 2017 by Reynold Xin, Josh Rosen and Kyle Pistor in Company

At Databricks, our engineers guide thousands of organizations to define their big data and cloud strategies. When migrating big data workloads to the...

Bay Area Apache Spark Meetup Summary

May 26, 2017 by Jules Damji in Company

On May 16, we held our monthly Bay Area Apache Spark Meetup (BASM) at SalesforceIQ in Palo Alto. In all, we had three...

Working with Nested Data Using Higher Order Functions in SQL on Databricks

May 24, 2017 by Herman van Hövell and Bill Chambers in Product

Nested data types offer Databricks customers and Apache Spark users powerful ways to manipulate structured data. In particular...

Databricks Runtime 3.0 Beta Delivers Cloud Optimized Apache Spark

May 24, 2017 by Reynold Xin in Product

A major value Databricks provides is the automatic provisioning, configuration, and tuning of clusters of machines that process data. Running on these machines...

Persistent Clusters: Simplifying Cluster Management for Analytics

May 19, 2017 by Evan Ye, Haogang Chen, Henry Davidge and Prakash Chockalingam in Company

Today we are excited to announce persistent clusters for analytics in Databricks. With persistent clusters, users no longer need to go through the...