Introducing New Built-in and Higher-Order Functions for Complex Data Types in Apache Spark 2.4

Apache Spark 2.4 introduces 29 new built-in functions for manipulating complex types (for example, array type), including higher-order...

Open Sourcing Databricks Integration Tools at Edmunds

November 12, 2018 by Shaun Elliott and Sam Shuster
This is a guest post from Shaun Elliott, Data Engineering Tech Lead, and Sam Shuster, Staff Engineer at Edmunds. What is Databricks and...

Introducing Apache Spark 2.4

November 8, 2018 by Wenchen Fan, Xiao Li and Reynold Xin
UPDATED: 11/19/2018 We are excited to announce the availability of Apache Spark 2.4 on Databricks as part of the Databricks Runtime 5.0...

SQL Pivot: Converting Rows to Columns

November 1, 2018 by MaryAnn Xue
Check out the "Why the Data Lakehouse is Your Next Data Warehouse" ebook to discover the inner workings...

Simplifying Change Data Capture with Databricks Delta

October 29, 2018 by Ameet Kini and Denny Lee
Get an early preview of O'Reilly's new ebook for the step-by-step guidance you need to start using Delta Lake. Note: We also recommend...

MLflow v0.7.0 Features New R API by RStudio

Today, we’re excited to announce MLflow v0.7.0, released with new features, including a new MLflow R client API contributed by RStudio...

What’s New for Apache Spark on Kubernetes in Apache Spark 2.4 Release

September 26, 2018 by Yinan Li
UPDATED: 11/12/2018 This is a community blog from Yinan Li, a software engineer at Google, working in the Kubernetes Engine team. He...

Simplify Market Basket Analysis using FP-growth on Databricks

September 17, 2018 by Bhavin Kukadia and Denny Lee
When providing recommendations to shoppers on what to purchase, you are often looking for items that are frequently purchased together (e.g. peanut butter...

Identify Suspicious Behavior in Video with Databricks Runtime for Machine Learning

September 12, 2018 by Raela Wang and Denny Lee
With the exponential growth of cameras and visual recordings, it is becoming increasingly important to operationalize and automate the process of video identification...

Introducing Flint: A time-series library for Apache Spark

September 11, 2018 by Li Jin and Kevin Rasmussen
This is a joint guest community blog by Li Jin at Two Sigma and Kevin Rasmussen at Databricks; they share how to use...