# DeepSpeed 0.1.0 Release Notes

Released by jeffra on 19 May 06:41.
## Features

### Distributed Training with Mixed Precision

- 16-bit mixed precision
- Single-GPU, multi-GPU, and multi-node training
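DeepSpeed is driven by a JSON configuration file. A minimal sketch enabling 16-bit mixed precision might look like the following (field names follow the DeepSpeed config schema; the batch size is an illustrative value, and defaults may differ by version):

```json
{
  "train_batch_size": 256,
  "fp16": {
    "enabled": true
  }
}
```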
### Model Parallelism

- Support for custom model parallelism
- Integration with Megatron-LM
### Memory and Bandwidth Optimizations

- Zero Redundancy Optimizer (ZeRO) stage 1 with all-reduce
- Constant Buffer Optimization (CBO)
- Smart gradient accumulation
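The idea behind smart gradient accumulation is to avoid communicating once per micro-batch: each rank first sums its local micro-batch gradients, then a single all-reduce averages them per optimizer step. A pure-Python sketch of the concept (function and variable names are illustrative, not DeepSpeed's API; the cross-rank sum stands in for the all-reduce):

```python
def accumulate_then_allreduce(micro_grads_per_rank):
    """Each rank accumulates its micro-batch gradients locally, then one
    collective (modeled here as a plain sum across ranks) produces the
    average gradient. One communication per step, not per micro-batch."""
    local_sums = [sum(grads) for grads in micro_grads_per_rank]  # local accumulation
    total_micro = sum(len(grads) for grads in micro_grads_per_rank)
    return sum(local_sums) / total_micro  # single "all-reduce", then average

# Two ranks, two micro-batch gradients each: average over all four.
g = accumulate_then_allreduce([[1.0, 3.0], [5.0, 7.0]])
```

With per-micro-batch communication this example would pay for four collectives; accumulation reduces that to one while producing the same averaged gradient.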
### Training Features

- Simplified training API
- Gradient clipping
- Automatic loss scaling with mixed precision
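Automatic loss scaling keeps FP16 gradients out of the underflow range by multiplying the loss by a scale factor, then adjusting that factor dynamically: halve it when gradients overflow, grow it back after a window of stable steps. A minimal sketch of the mechanism (illustrative, not DeepSpeed's implementation; the window size here is tiny for demonstration):

```python
import math

class DynamicLossScaler:
    """Sketch of dynamic loss scaling: shrink the scale on gradient
    overflow, double it after `window` consecutive clean steps."""

    def __init__(self, init_scale=2.0**16, window=1000):
        self.scale = init_scale
        self.window = window
        self.good_steps = 0

    def update(self, grads):
        overflow = any(math.isinf(g) or math.isnan(g) for g in grads)
        if overflow:
            self.scale = max(self.scale / 2.0, 1.0)  # back off
            self.good_steps = 0
        else:
            self.good_steps += 1
            if self.good_steps == self.window:
                self.scale *= 2.0  # probe a larger scale again
                self.good_steps = 0
        return not overflow  # caller skips the optimizer step on overflow

scaler = DynamicLossScaler(init_scale=4.0, window=2)
scaler.update([0.1, float("inf")])  # overflow: scale halves to 2.0
scaler.update([0.1, 0.2])
scaler.update([0.3, 0.4])           # two clean steps: scale doubles back
```

On overflow the step is skipped entirely, so a bad scale costs one iteration rather than corrupting the weights.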
### Training Optimizers

- Fused Adam optimizer and arbitrary `torch.optim.Optimizer` support
- Memory-bandwidth-optimized FP16 optimizer
- Large-batch training with the LAMB optimizer
- Memory-efficient training with the ZeRO optimizer
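LAMB enables large-batch training by rescaling each layer's update with a layer-wise trust ratio, the ratio of the weight norm to the update norm, so that layers with large weights are not under-updated when the batch size grows. A stripped-down sketch of that core step (illustrative only; real LAMB applies the ratio to an Adam-style update with weight decay):

```python
import math

def lamb_trust_ratio(weights, update):
    """Layer-wise trust ratio ||w|| / ||u|| at the heart of LAMB."""
    w_norm = math.sqrt(sum(w * w for w in weights))
    u_norm = math.sqrt(sum(u * u for u in update))
    return w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0

def lamb_step(weights, update, lr=0.01):
    """Apply one trust-ratio-scaled update to a single layer."""
    r = lamb_trust_ratio(weights, update)
    return [w - lr * r * u for w, u in zip(weights, update)]

ratio = lamb_step([3.0, 4.0], [1.0, 0.0], lr=0.1)  # trust ratio is 5.0 here
```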
### Training-Agnostic Checkpointing
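The idea behind training-agnostic checkpointing is to bundle all training state (model weights, optimizer state, progress counters) into one object that can be saved and restored without knowledge of the training loop's internals. A generic sketch of the pattern (illustrative; DeepSpeed's own checkpointing API is not shown here):

```python
import os
import pickle
import tempfile

def save_checkpoint(path, state):
    """Persist the whole training state as one bundle."""
    with open(path, "wb") as f:
        pickle.dump(state, f)

def load_checkpoint(path):
    """Restore the bundle; resuming needs nothing else."""
    with open(path, "rb") as f:
        return pickle.load(f)

state = {"step": 1200, "model": {"w": [0.1, 0.2]}, "optimizer": {"lr": 0.001}}
path = os.path.join(tempfile.mkdtemp(), "ckpt.bin")
save_checkpoint(path, state)
restored = load_checkpoint(path)
```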
### Advanced Parameter Search

- Learning Rate Range Test
- 1Cycle Learning Rate Schedule
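A 1Cycle schedule ramps the learning rate from a minimum to a maximum over the first part of training and back down over the rest; the LR range test is a single sweep over the same interval used to pick those two endpoints. A minimal sketch of the triangular form of the schedule (illustrative; real 1Cycle variants add a final annealing phase and a momentum cycle):

```python
def one_cycle_lr(step, total_steps, min_lr=1e-4, max_lr=1e-2):
    """Triangular 1Cycle sketch: linear ramp up to max_lr at the
    midpoint, then linear decay back to min_lr."""
    half = total_steps // 2
    if step <= half:
        frac = step / half                              # warm-up phase
    else:
        frac = (total_steps - step) / (total_steps - half)  # cool-down phase
    return min_lr + (max_lr - min_lr) * frac

lrs = [one_cycle_lr(s, 100, min_lr=0.0, max_lr=1.0) for s in range(101)]
```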
### Simplified Data Loader
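A distributed data loader's essential job is to shard the dataset so the ranks together cover every sample exactly once. A pure-Python sketch of round-robin sharding (illustrative; not DeepSpeed's loader implementation):

```python
def shard_indices(num_samples, rank, world_size):
    """Each rank takes every world_size-th index starting at its rank,
    so shards are disjoint and jointly cover the dataset."""
    return list(range(rank, num_samples, world_size))

# 10 samples split across 4 ranks.
shards = [shard_indices(10, r, 4) for r in range(4)]
```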
### Performance Analysis and Debugging