SESSION
Exploring UDTFs (User-Defined Table Functions) in PySpark
OVERVIEW
EXPERIENCE | In Person |
---|---|
TYPE | Lightning Talk |
TRACK | Data Engineering and Streaming |
INDUSTRY | Enterprise Technology |
TECHNOLOGIES | Apache Spark |
SKILL LEVEL | Beginner |
DURATION | 20 min |
User-Defined Table Functions (UDTFs) in PySpark let you write custom Python logic that returns a table of rows, rather than a single value, for each call. This presentation covers the basics of UDTFs, including their structure and capabilities. We then introduce polymorphism and demonstrate how to make UDTFs polymorphic, so that they can adapt to different input schemas and data types. Through practical examples, we showcase both standard and polymorphic UDTFs in PySpark. Join us to learn how UDTFs work and how polymorphism makes them more flexible for data processing.
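To illustrate the basic structure the abstract refers to, here is a minimal sketch of a standard UDTF. It is not taken from the session materials. It assumes PySpark 3.5 or later, and the names `SquareNumbers`, `square_numbers`, and `run_with_spark` are hypothetical. The handler is an ordinary Python class whose `eval` method yields one tuple per output row. The polymorphic variant, which as far as we know computes its output schema in a static `analyze` method instead of a fixed `returnType`, is not shown.

```python
class SquareNumbers:
    """Hypothetical UDTF handler: a plain Python class.

    Each tuple yielded by eval() becomes one row of the output table.
    """

    def eval(self, start: int, end: int):
        # Emit one (num, squared) row for every integer in [start, end].
        for num in range(start, end + 1):
            yield (num, num * num)


def run_with_spark():
    """Wrap the handler as a UDTF and call it (assumes PySpark 3.5+ is installed)."""
    # Imported lazily so the handler class above can be tested without Spark.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import lit, udtf

    spark = SparkSession.builder.getOrCreate()

    # A standard (non-polymorphic) UDTF declares a fixed output schema.
    square_numbers = udtf(SquareNumbers, returnType="num: int, squared: int")

    # Call from the DataFrame API; the result is a DataFrame.
    square_numbers(lit(1), lit(3)).show()

    # Register the function so it can be used in a SQL FROM clause.
    spark.udtf.register("square_numbers", square_numbers)
    spark.sql("SELECT * FROM square_numbers(1, 3)").show()
```

Calling `run_with_spark()` in an environment with PySpark available should display the generated rows twice, once from the DataFrame API and once from SQL. Because the handler is plain Python, its row logic can also be checked directly with `list(SquareNumbers().eval(1, 3))`.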
SESSION SPEAKERS
Haejoon Lee
Software Engineer
Databricks
Takuya Ueshin
Senior Software Engineer
Databricks