Skip to main content
Page 1

Introducing Python User-Defined Table Functions (UDTFs)

Apache Spark™ 3.5 and Databricks Runtime 14.0 have brought an exciting feature to the table: Python user-defined table functions (UDTFs). In this blog...

Arrow-optimized Python UDFs in Apache Spark™ 3.5

In Apache Spark™, Python User-Defined Functions (UDFs) are among the most popular features. They empower users to craft custom code tailored to their...

Introducing Apache Spark™ 3.5

Today, we are happy to announce the availability of Apache Spark™ 3.5 on Databricks as part of Databricks Runtime 14.0. We extend our...

Spark Connect Available in Apache Spark 3.4

Last year Spark Connect was introduced at the Data and AI Summit. As part of the recently released Apache SparkTM 3.4, Spark Connect...

Introducing Apache Spark™ 3.4 for Databricks Runtime 13.0

Today, we are happy to announce the availability of Apache Spark™ 3.4 on Databricks as part of Databricks Runtime 13.0 . We extend...

Memory Profiling in PySpark

There are many factors in a PySpark program's performance. PySpark supports various profiling tools to expose tight loops of your program and allow...

How to Profile PySpark

In Apache Spark™, declarative Python APIs are supported for big data workloads. They are powerful enough to handle most common use cases. Furthermore...