Entropy-based Log Redaction for Apache Spark on Databricks

May 30, 2017 by Weiluo Ren and Yu Peng
This blog post is part of our series of internal engineering blogs on Databricks platform, infrastructure management, tooling, monitoring, and provisioning. We love...

Bay Area Apache Spark Meetup Summary

May 26, 2017 by Jules Damji
On May 16, we held our monthly Bay Area Apache Spark Meetup (BASM) at SalesforceIQ in Palo Alto. In all, we had three...

Using sparklyr in Databricks

May 25, 2017 by Hossein Falaki
Try this notebook on Databricks with all instructions as explained in this post notebook. In September 2016, RStudio announced sparklyr, a new...

Working with Nested Data Using Higher Order Functions in SQL on Databricks

View this notebook on Databricks. Nested data types offer Databricks customers and Apache Spark users powerful ways to manipulate structured data. In particular...

Databricks Runtime 3.0 Beta Delivers Cloud Optimized Apache Spark

May 24, 2017 by Reynold Xin
A major value Databricks provides is the automatic provisioning, configuration, and tuning of clusters of machines that process data. Running on these machines...

On-Demand Webinar and FAQ: Deep Learning and Apache Spark: Workflows and Best Practices

May 23, 2017 by Tim Hunter and Jules Damji
On May 4th, we hosted a live webinar, Deep Learning and Apache Spark: Workflows and Best Practices. Rather than comparing deep...

Running Streaming Jobs Once a Day For 10x Cost Savings

This is the sixth post in a multi-part series about how you can perform complex streaming analytics using Apache Spark. Traditionally, when people...

Persistent Clusters: Simplifying Cluster Management for Analytics

Today we are excited to announce persistent clusters for analytics in Databricks. With persistent clusters, users no longer need to go through the...

Taking Apache Spark’s Structured Streaming to Production

This is the fifth post in a multi-part series about how you can perform complex streaming analytics using Apache Spark. At Databricks, we’ve...

Detecting Abuse at Scale: Locality Sensitive Hashing at Uber Engineering

May 9, 2017 by Yun Ni, Kelvin Chu and Joseph Bradley
This is a cross blog post effort between Databricks and Uber Engineering. Yun Ni is a software engineer on Uber’s Machine Learning Platform...