Large Language Models

Padding Token

A special token appended to shorter sequences so that every sequence in a batch has the same length; padding positions are typically masked out so they do not affect computation.

Transformer models process a batch as a single fixed-shape tensor, so sequences of different lengths must be padded to a common length before they can be stacked together; an attention mask then marks the padded positions so the model ignores them.
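As a concrete illustration, here is a minimal sketch in PyTorch (the token IDs are made up, and a pad ID of 0 is assumed purely for illustration) that pads a small batch to a common length and derives the matching attention mask:

```python
import torch

pad_id = 0  # hypothetical ID reserved for the padding token

# Three variable-length sequences of (made-up) token IDs.
batch = [
    [101, 2009, 2003, 102],        # 4 tokens
    [101, 2183, 102],              # 3 tokens
    [101, 2026, 3899, 2003, 102],  # 5 tokens
]

max_len = max(len(seq) for seq in batch)

# Right-pad every sequence to max_len with pad_id so the batch
# can be stacked into one fixed-shape tensor.
input_ids = torch.tensor(
    [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
)

# Attention mask: 1 for real tokens, 0 for padding, telling the
# model which positions to ignore during attention.
attention_mask = (input_ids != pad_id).long()

print(input_ids)
print(attention_mask)
```

In practice, tokenizer libraries usually perform this padding and mask construction automatically when asked to batch inputs; the sketch above only makes the underlying mechanics explicit.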

Related Terms

  • Special Token
  • Batching
  • Attention Mask

Tags

large-language-models special-token batching attention-mask

Added: November 18, 2025