SESSION

Efficient Stable Diffusion Pre-Training on Billions of Images with Ray


OVERVIEW

EXPERIENCE: In Person
TYPE: Breakout
TRACK: Data Science and Machine Learning
INDUSTRY: Enterprise Technology
TECHNOLOGIES: AI/Machine Learning, GenAI/LLMs
SKILL LEVEL: Intermediate
DURATION: 40 min

Stable Diffusion can consistently produce high-quality images. However, pre-training a Stable Diffusion model is challenging: it is a long-running workload that ingests billions of images with complex preprocessing logic on hundreds of GPUs. To maximize performance and cost efficiency at this scale, we need to address challenges such as scaling out data preprocessing, improving GPU utilization, ensuring fault tolerance, and managing heterogeneous clusters. In this talk, we will introduce how to use Ray Data and Ray Train to build an end-to-end pre-training solution that achieves state-of-the-art performance at scale.

Takeaways:
- Easily implement an end-to-end Stable Diffusion pre-training pipeline on billions of images using Ray.
- Improve efficiency and stability in large-scale multimodal data processing with Ray Data.
- Scale online preprocessing and distributed training across different GPU types to increase GPU utilization and reduce costs.
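
To give a flavor of the approach, here is a minimal sketch (not the speakers' actual code) of coupling Ray Data online preprocessing with a Ray Train TorchTrainer. The dataset path, the preprocessing body, and the tiny stand-in model are illustrative assumptions; a real pipeline would decode and VAE-encode images, tokenize captions, and train a full Stable Diffusion UNet.

```python
import torch
import ray
from ray import train
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer, prepare_model


def preprocess(batch):
    # Stand-in for the real online preprocessing (image decoding, resizing,
    # VAE latent encoding, caption tokenization): just scale pixels to [0, 1].
    batch["image"] = batch["image"].astype("float32") / 255.0
    return batch


def train_loop_per_worker(config):
    # Toy stand-in for the Stable Diffusion UNet and denoising objective.
    model = prepare_model(torch.nn.Linear(64 * 64 * 3, 4))
    optimizer = torch.optim.AdamW(model.parameters(), lr=config["lr"])

    # Each training worker streams its shard of the preprocessed dataset.
    shard = train.get_dataset_shard("train")
    for epoch in range(config["num_epochs"]):
        for batch in shard.iter_torch_batches(batch_size=config["batch_size"]):
            images = batch["image"].reshape(batch["image"].shape[0], -1)
            loss = model(images).pow(2).mean()  # placeholder loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        train.report({"epoch": epoch, "loss": loss.item()})


# Streaming read plus online preprocessing. Ray Data overlaps this stage with
# GPU training and can schedule it on different node types than the trainers.
train_ds = ray.data.read_parquet("s3://my-bucket/image-caption-shards/")  # hypothetical path
train_ds = train_ds.map_batches(preprocess, batch_size=256)

trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"lr": 1e-4, "num_epochs": 1, "batch_size": 64},
    scaling_config=ScalingConfig(num_workers=8, use_gpu=True),
    datasets={"train": train_ds},
)
result = trainer.fit()
```

Because the preprocessing stage executes as Ray Data tasks and actors separate from the training workers, it can be placed on cheaper or differently equipped nodes, which is the mechanism behind the heterogeneous-cluster cost savings described above.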

SESSION SPEAKERS

Yunxuan Xiao

Software Engineer
Anyscale Inc.

Hao Chen

Staff Software Engineer
Anyscale Inc.