The Ultra-Scale Playbook

Training LLMs on GPU Clusters

By Nouamane Tazi, Ferdinand Mom

Embark on a journey to orchestrate thousands of GPUs and scale LLM training to today's largest compute clusters. Starting with the memory and compute anatomy of model training, we then explore five dimensions of parallelism for distributing training efficiently. From there, we dive deeper into how GPUs are designed and how specialised kernels increase training efficiency further. This book is a great starting point if you want to get into training ever-larger models efficiently at scale!

Details

Publication Date
Jul 28, 2025
Language
English
Category
Computers & Technology
Copyright
Creative Commons NonCommercial, ShareAlike (CC BY-NC-SA)
Contributors
Authors: Nouamane Tazi, Ferdinand Mom, Haojun Zhao, Phuc Nguyen, Mohamed Mekkouri, Leandro von Werra, Thomas Wolf
Cover design or artwork: Florine Baeriswyl

Specifications

Pages
246
Binding Type
Paperback Perfect Bound
Interior Color
Color
Dimensions
Digest (5.5 x 8.5 in / 140 x 216 mm)