How to Train XGBoost With Spark

XGBoost is currently one of the most popular machine learning libraries, and distributed training is increasingly required to accommodate rapidly growing datasets. For distributed training on a Spark cluster, the XGBoost4J-Spark package can be used in Scala pipelines, but it presents issues with Python pipelines. This article will go over...
