PySpark in 2023: A Year in Review
March 25, 2024 by Hyukjin Kwon, Takuya Ueshin, Allison Wang, Ruifeng Zheng, Xinrong Meng, Haejoon Lee and Amanda Liu in Industries
With the releases of Apache Spark 3.4 and 3.5 in 2023, we focused heavily on improving PySpark performance, flexibility, and ease of use...

Python Dependency Management in Spark Connect
November 13, 2023 by Hyukjin Kwon and Ruifeng Zheng in Engineering Blog
Managing the environment of an application in a distributed computing environment can be challenging. Ensuring that all nodes have the necessary environment to...