The Architecture of the Next CERN Accelerator Logging Service

December 14, 2017 by Jakub Wozniak
This is a community guest blog from Jakub Wozniak, a software engineer and project technical lead at CERN physics laboratory, further expounding...

Arbitrary Stateful Processing in Apache Spark’s Structured Streaming

October 17, 2017 by Bill Chambers and Jules Damji
This is the seventh post in a multi-part series about how you can perform complex streaming analytics using Apache Spark and Structured Streaming...

Benchmarking Structured Streaming on Databricks Runtime Against State-of-the-Art Streaming Systems

October 11, 2017 by Burak Yavuz
Update Dec 14, 2017: As a result of a fix in the toolkit’s data generator, Apache Flink's performance on a cluster of...

Building Complex Data Pipelines with Unified Analytics Platform

October 5, 2017 by Jules Damji and Jason Pohl
Introduction: Big data practitioners often post recurring questions on Quora: What is data engineering? How to become a data scientist? What’s a data...

Anthology of Technical Assets on Apache Spark's Structured Streaming

August 24, 2017 by Jules Damji
Older anthologies collated a collection of contributions from various authors around a theme—bounded then as a journal or periodical. Newer anthologies, however, include...

Five Spark SQL Utility Functions to Extract and Explore Complex Data Types

June 13, 2017 by Jules Damji
For developers, often the how is as important as the why. While our in-depth blog explains the...

Making Apache Spark the Fastest Open Source Streaming Engine

June 6, 2017 by Michael Lumb
We started building Structured Streaming in Apache Spark one year ago as a new, simpler way to develop continuous applications. Not only...

Running Streaming Jobs Once a Day For 10x Cost Savings

This is the sixth post in a multi-part series about how you can perform complex streaming analytics using Apache Spark. Traditionally, when people...

Taking Apache Spark’s Structured Streaming to Production

This is the fifth post in a multi-part series about how you can perform complex streaming analytics using Apache Spark. At Databricks, we’ve...

Event-time Aggregation and Watermarking in Apache Spark’s Structured Streaming

This is the fourth post in a multi-part series about how you can perform complex streaming analytics using Apache Spark. Continuous applications often...