Machine Learning - The Databricks Blog

How to Train XGBoost With Spark

XGBoost is currently one of the most popular machine learning libraries, and distributed training is increasingly required to keep up with rapidly growing datasets. For distributed training on a Spark cluster, the XGBoost4J-Spark package can be used in Scala pipelines, but it presents issues with Python pipelines. This article will go over...
