Training & Optimization
Gradient Accumulation
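Summing gradients over several small batches (micro-batches) before applying a single optimizer update, simulating a larger effective batch size without holding that larger batch in memory at once.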
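With N accumulation steps and a micro-batch size of B, the effective batch size is N × B. Each micro-batch loss or gradient is typically scaled by 1/N so that the accumulated gradient matches the average over the larger batch. The trade-off is more forward and backward passes per update in exchange for lower peak memory use.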
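A minimal sketch of the idea in plain Python, assuming a toy one-parameter model y = w * x with mean squared error and a dataset that splits evenly into micro-batches. The names `grad_mse` and `accumulated_step` are hypothetical helpers introduced for illustration, not part of any library. It shows that accumulating scaled micro-batch gradients gives the same update as one full-batch step.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean squared error w.r.t. w for the 1-D model y = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_step(w, xs, ys, micro_batch_size, lr):
    """One SGD update built from equally sized micro-batches.

    Assumes len(xs) is divisible by micro_batch_size (illustrative only).
    """
    num_micro = len(xs) // micro_batch_size
    grad_sum = 0.0
    for i in range(num_micro):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        # Scale each micro-batch gradient by 1/num_micro so the running
        # sum equals the mean gradient over the full effective batch.
        grad_sum += grad_mse(w, xs[lo:hi], ys[lo:hi]) / num_micro
    # Single parameter update after all micro-batches have been processed.
    return w - lr * grad_sum


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
w0, lr = 0.0, 0.01

# Reference: one update computed from the whole batch of 8 examples.
full_batch_w = w0 - lr * grad_mse(w0, xs, ys)

# Same update computed from 4 micro-batches of 2 examples each.
accumulated_w = accumulated_step(w0, xs, ys, micro_batch_size=2, lr=lr)

print(full_batch_w, accumulated_w)
```

In a deep learning framework the same pattern usually appears as several backward passes followed by one optimizer step and one gradient reset; the exact calls depend on the framework in use.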
Related Concepts
- Training
- Batch Size
- Memory Efficiency
Tags
training-optimization training batch-size memory-efficiency