Skip to main content
Page 1
Industries category icon 1

PySpark in 2023: A Year in Review

With the releases of Apache Spark 3.4 and 3.5 in 2023, we focused heavily on improving PySpark performance, flexibility, and ease of use...
Engineering blog

Simplify PySpark testing with DataFrame equality functions

The DataFrame equality test functions were introduced in Apache Spark™ 3.5 and Databricks Runtime 14.2 to simplify PySpark unit testing. The full set...
Engineering blog

10 Minutes from pandas to Koalas on Apache Spark

This is a guest community post from Haejoon Lee, a software engineer at Mobigen in South Korea and a Koalas contributor. pandas is...