Session
Next-Level PySpark UDF Debugging
Overview
Experience | In Person |
---|---|
Type | Breakout |
Track | Data Engineering and Streaming |
Industry | Enterprise Technology, Professional Services, Financial Services |
Technologies | Apache Spark |
Skill Level | Intermediate |
Duration | 40 min |
Debugging PySpark User Defined Functions (UDFs) has long been challenging due to the distributed execution model and limited runtime visibility. Traditional methods often require manually searching through scattered logs, making debugging slow and inefficient.
In this talk, we introduce a set of powerful UDF debugging improvements, including a new logging framework that provides structured, queryable insights into UDF execution. We also cover timeouts to stop long-running tasks, better error messages for easier debugging, and best practices for common UDF issues.
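As a rough illustration of what "structured, queryable" UDF logs can mean, the sketch below emits one JSON object per log record from inside a plain Python function that could be registered as a UDF. It uses only the Python standard library and is not the logging framework presented in the session. The names `parse_amount`, `JsonFormatter`, `ListHandler`, and the `context` field are all hypothetical, chosen for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object so logs can be filtered by field."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # "context" is a hypothetical field name, passed in via `extra=`.
            "context": getattr(record, "context", {}),
        }
        return json.dumps(payload, sort_keys=True)


class ListHandler(logging.Handler):
    """Collect formatted records in memory (stand-in for a real log sink)."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


logger = logging.getLogger("udf.parse_amount")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example's output isolated from the root logger
handler = ListHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)


def parse_amount(raw):
    """UDF body: parse strings like '1,234.5'; log and return None on bad input."""
    try:
        return float(raw.replace(",", ""))
    except (AttributeError, ValueError) as exc:
        logger.warning(
            "could not parse amount",
            extra={"context": {"raw": repr(raw), "error": type(exc).__name__}},
        )
        return None


# With PySpark installed, the function could be wrapped as a UDF, for example:
#   from pyspark.sql.functions import udf
#   parse_amount_udf = udf(parse_amount, "double")
```

Because each record is a single JSON line, failures can be filtered by field (for example by `context.error`) instead of searched for as free text. Where the records end up on a real cluster depends on how logging is configured on the workers.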
Session Speakers
Allison Wang
Staff Software Engineer
Databricks
Takuya Ueshin
Sr. Software Engineer
Databricks