Session

Next-Level PySpark UDF Debugging

Overview

Experience: In Person
Type: Breakout
Track: Data Engineering and Streaming
Industry: Enterprise Technology, Professional Services, Financial Services
Technologies: Apache Spark
Skill Level: Intermediate
Duration: 40 min

Debugging PySpark User Defined Functions (UDFs) has long been challenging due to the distributed execution model and limited runtime visibility. Traditional methods often require manually searching through scattered logs, making debugging slow and inefficient.

In this talk, we introduce a set of powerful UDF debugging improvements, including a new logging framework that provides structured, queryable insights into UDF execution. We also cover timeouts to stop long-running tasks, better error messages for easier debugging, and best practices for common UDF issues.

Session Speakers

Allison Wang

Staff Software Engineer
Databricks

Takuya Ueshin

Sr. Software Engineer
Databricks