We recently announced the release of Delta Lake 0.8.0 , which introduces schema evolution and performance improvements in merge and operational metrics in...
Databricks is thrilled to announce our new optimized autoscaling feature. The new Apache Spark™-aware resource manager leverages Spark shuffle and executor statistics to...
Import this notebook on Databricks Structured Streaming in Apache Spark 2.0 decoupled micro-batch processing from its high-level APIs for a couple of reasons...
This is the fourth post in a multi-part series about how you can perform complex streaming analytics using Apache Spark. Continuous applications often...
Many complex stream processing pipelines must maintain state across a period of time. For example, if you are interested in understanding user behavior...
Earlier, we presented new visualizations introduced in Apache Spark 1.4.0 to understand the behavior of Spark applications. Continuing the theme, this blog highlights...