SESSION

Exploring UDTFs (User-Defined Table Functions) in PySpark

OVERVIEW

EXPERIENCE: In Person
TYPE: Lightning Talk
TRACK: Data Engineering and Streaming
INDUSTRY: Enterprise Technology
TECHNOLOGIES: Apache Spark
SKILL LEVEL: Beginner
DURATION: 20 min

User-Defined Table Functions (UDTFs) in PySpark let you write custom functions that return a table of rows rather than a single value. This presentation covers the basics of UDTFs, including their structure and capabilities. We then introduce polymorphism and show how to make UDTFs polymorphic, so that they adapt to different input schemas and data types. Practical examples demonstrate both standard and polymorphic UDTFs in PySpark. Join us to learn how UDTFs work and how polymorphism makes them more flexible for data processing.
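To give a feel for the structure of a standard UDTF, here is a minimal sketch. It is not taken from the session itself. The class name `SquareNumbers`, the column names, and the helper `run_with_spark` are hypothetical, and the Spark part assumes PySpark 3.5 or later with a working Spark runtime. The handler is an ordinary Python class whose `eval` method yields one tuple per output row.

```python
from typing import Iterator, Tuple


class SquareNumbers:
    """Hypothetical UDTF handler: one call can emit many rows."""

    def eval(self, start: int, end: int) -> Iterator[Tuple[int, int]]:
        # Each yielded tuple becomes one output row (num, squared).
        for num in range(start, end + 1):
            yield (num, num * num)


def run_with_spark() -> None:
    """Wrap the class as a UDTF and call it.

    Assumes PySpark 3.5+ is installed and a Spark runtime is available;
    this function is not called automatically.
    """
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import lit, udtf

    spark = SparkSession.builder.getOrCreate()
    # The output schema is fixed up front as a DDL string.
    square_numbers = udtf(SquareNumbers, returnType="num: int, squared: int")
    square_numbers(lit(1), lit(3)).show()
    spark.stop()


# Without Spark, the handler can still be exercised as plain Python.
rows = list(SquareNumbers().eval(1, 3))
print(rows)
```

In this sketch the return schema is fixed when the UDTF is created. As far as we understand the API, a polymorphic UDTF instead omits the fixed `returnType` and defines a static `analyze` method that derives the output schema from the arguments of each call.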

SESSION SPEAKERS

Haejoon Lee

Software Engineer
Databricks

Takuya Ueshin

Senior Software Engineer
Databricks