Scaling XGBoost With Spark Connect ML on Grace Blackwell
Overview
| | |
|---|---|
| Experience | In Person |
| Type | Breakout |
| Track | Artificial Intelligence |
| Industry | Retail and CPG - Food, Financial Services |
| Technologies | Apache Spark |
| Skill Level | Intermediate |
| Duration | 40 min |
XGBoost is one of the most widely used off-the-shelf gradient-boosting algorithms for analyzing tabular datasets. Unlike deep learning models, gradient-boosted decision trees require the entire dataset to be in memory for efficient model training. To overcome this limitation, XGBoost features a distributed out-of-core implementation that fetches data in batches, which benefits significantly from the latest NVIDIA GPUs and the ultra-high bandwidth of NVLink-C2C.
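For a concrete sense of the mechanism, here is a minimal sketch of XGBoost's external-memory (out-of-core) training path, assuming XGBoost 2.0 or later built with CUDA support; the synthetic batch generator and cache path are illustrative stand-ins for a real sharded data loader:

```python
import numpy as np
import xgboost as xgb


class BatchIter(xgb.DataIter):
    """Feeds the dataset to XGBoost one batch at a time instead of all at once."""

    def __init__(self, num_batches: int):
        self._num_batches = num_batches
        self._it = 0
        # Batches are staged in an on-disk cache, so the full dataset
        # never needs to fit in device memory.
        super().__init__(cache_prefix="./cache")

    def next(self, input_data) -> int:
        if self._it == self._num_batches:
            return 0  # signal: no more batches
        # Synthetic batch; a real loader would read one shard from disk here.
        rng = np.random.default_rng(self._it)
        X = rng.standard_normal((100_000, 32))
        y = rng.integers(0, 2, size=100_000)
        input_data(data=X, label=y)  # hand this batch to XGBoost
        self._it += 1
        return 1  # signal: more batches remain

    def reset(self) -> None:
        self._it = 0  # called whenever XGBoost re-reads the data from the start


# A DMatrix built from an iterator streams batches rather than materializing them.
Xy = xgb.DMatrix(BatchIter(num_batches=8))
booster = xgb.train({"tree_method": "hist", "device": "cuda"}, Xy, num_boost_round=50)
```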
In this talk, we will share our work on optimizing XGBoost for the Grace Blackwell superchip. The fast chip-to-chip link between the CPU and the GPU enables XGBoost to scale up without compromising performance. Our work has effectively increased XGBoost's training capacity to over 1.2 TB on a single node.
The approach scales out to GPU clusters using Spark, enabling XGBoost to handle terabytes of data efficiently. We will demonstrate how to combine XGBoost's out-of-core algorithms with the new Spark Connect ML in Spark 4.0 for large-scale model training workflows.
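As a rough sketch of what the driver side of such a workflow could look like, assuming a Spark 4.0 cluster reachable over Spark Connect and the xgboost.spark estimator (the connect URL, dataset path, and column names below are placeholders):

```python
from pyspark.sql import SparkSession
from xgboost.spark import SparkXGBClassifier

# Connect to a remote Spark 4.0 cluster through Spark Connect
# (sc://localhost:15002 is the default endpoint, used here as a placeholder).
spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()

# Placeholder training data with a vector "features" column and a "label" column.
train_df = spark.read.parquet("/data/train.parquet")

# One GPU per Spark task; XGBoost coordinates distributed training across workers.
classifier = SparkXGBClassifier(
    features_col="features",
    label_col="label",
    device="cuda",
    num_workers=4,
)
model = classifier.fit(train_df)
model.transform(train_df).select("label", "prediction").show(5)
```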
Session Speakers
Bobby Wang
Engineer, NVIDIA

Jiaming Yuan
Engineer, NVIDIA