Large Language Models
Special Token
Reserved tokens such as [CLS], [SEP], [MASK], and [PAD] that carry structural meaning rather than text content. Different model architectures use them to mark sequence-level representations, segment boundaries, masked positions, and padding.
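The structural role of these tokens can be sketched in plain Python. The helper below is an illustrative assumption, not any real tokenizer's API: it frames one or two token sequences BERT-style with [CLS] and [SEP], pads to a fixed length, and builds the matching attention mask.

```python
# BERT-style input framing with special tokens (illustrative sketch).
CLS, SEP, PAD = "[CLS]", "[SEP]", "[PAD]"

def build_input(tokens_a, tokens_b=None, max_len=12):
    """Wrap one or two token sequences with special tokens and pad."""
    seq = [CLS] + tokens_a + [SEP]
    if tokens_b:
        seq = seq + tokens_b + [SEP]
    # Attention mask: 1 for real tokens, 0 for padding positions.
    mask = [1] * len(seq) + [0] * (max_len - len(seq))
    seq = seq + [PAD] * (max_len - len(seq))
    return seq, mask

tokens, mask = build_input(["how", "are", "you"], ["fine", "thanks"])
# tokens: [CLS] how are you [SEP] fine thanks [SEP] [PAD] [PAD] [PAD] [PAD]
```

The model never sees raw padding as content: the attention mask zeros out [PAD] positions, while [SEP] tells it where one segment ends and the next begins.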
Special tokens let a model separate structure from content: [CLS] provides a sequence-level representation, [SEP] marks segment boundaries, [MASK] flags positions to predict during pretraining, and [PAD] fills batches to a uniform length.
Related Concepts
- Token
- BERT
- Tokenization
Tags
large-language-models token bert tokenization
Related Terms
BERT
Bidirectional Encoder Representations from Transformers - a Transformer encoder model that understands context by attending to text in both directions.
Token
The basic unit of text that a language model processes, typically representing a word, subword, or character. Tokens are the fundamental building blocks for LLM input and output.
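Before a model sees any text, each token is mapped to an integer ID through a vocabulary. The toy vocabulary below is a made-up assumption used only to show the idea; real vocabularies contain tens of thousands of entries.

```python
# Toy vocabulary mapping tokens to integer IDs (illustrative only).
vocab = {"[PAD]": 0, "[UNK]": 1, "the": 2, "cat": 3, "sat": 4}

def encode(words):
    """Map each token to its vocab ID; out-of-vocabulary tokens get [UNK]."""
    return [vocab.get(w, vocab["[UNK]"]) for w in words]

ids = encode(["the", "cat", "sat", "purred"])  # "purred" is out-of-vocab
```

Subword tokenization (below) exists largely to shrink this out-of-vocabulary problem: rare words are split into pieces that are in the vocabulary.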
Tokenization
The process of breaking text into smaller units (tokens) that language models can process, using algorithms like BPE or WordPiece.
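One step of BPE training can be sketched in a few lines: count every adjacent symbol pair across the corpus, then merge the most frequent pair into a single symbol. Real BPE repeats this thousands of times; the three-word corpus here is a toy assumption.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words; return the top pair."""
    pairs = Counter()
    for symbols in words:
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += 1
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with its concatenation."""
    merged = []
    for symbols in words:
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged.append(out)
    return merged

corpus = [list("lower"), list("lowest"), list("low")]
pair = most_frequent_pair(corpus)   # ("l","o") and ("o","w") both occur 3 times
corpus = merge_pair(corpus, pair)
```

Each merge adds one entry to the vocabulary, so frequent character sequences gradually become single tokens while rare words stay decomposable into known pieces.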