The Ultra-Scale Playbook

Name: The Ultra-Scale Playbook
SKU: 45yk9dj
Price: 30.00 USD
Availability: InStock

Training LLMs on GPU Clusters

Usually printed in 3 - 5 business days

Embark on a journey to orchestrate thousands of GPUs and scale LLM training to today's largest compute clusters. Starting with the memory and compute anatomy of model training, we explore five dimensions of parallelism for distributing training efficiently. From there, we dive deeper into how GPUs are designed and how specialised kernels further increase training efficiency. This book is a great starting point if you want to get into training ever-larger models efficiently at scale!

Details

Publication Date: Jul 28, 2025
Language: English
Category: Computers & Technology
Copyright: Creative Commons NonCommercial, ShareAlike (CC BY-NC-SA)
Contributors: By (author): Nouamane Tazi, By (author): Ferdinand Mom, By (author): Haojun Zhao, By (author): Phuc Nguyen, By (author): Mohamed Mekkouri, By (author): Leandro von Werra, By (author): Thomas Wolf, Cover design or artwork by: Florine Baeriswyl

Specifications

Pages: 246
Binding Type: Paperback Perfect Bound
Interior Color: Color
Dimensions: Digest (5.5 x 8.5 in / 140 x 216 mm)

Keywords

Distributed Training, Artificial Intelligence, Large Language Models, Machine Learning, Generative AI, High Performance Computing, Cluster Computing, Tensor Parallelism, Sequence Parallelism, Pipeline Parallelism, Data Parallelism, Context Parallelism, ZeRO Optimizer, CUDA, CUDA Kernel, PyTorch, GenAI, Expert Parallelism
