Large Language Models
Decoder-Only Model
A transformer architecture built only from decoder blocks, whose causal attention mask lets each token attend solely to earlier positions, enabling autoregressive (left-to-right) generation (the GPT family is the canonical example).
This architecture underlies most modern large language models, since its next-token prediction objective trains effectively at scale.
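Since causal masking is the defining feature of a decoder-only model, here is a minimal NumPy sketch of the lower-triangular mask a decoder block applies to its attention scores before the softmax; the sequence length and random scores are illustrative placeholders, not values from any particular model.

```python
import numpy as np

seq_len = 5

# 1 where attention is allowed (query position i, key position j <= i),
# 0 where a future position would be visible.
mask = np.tril(np.ones((seq_len, seq_len)))

# The mask is applied to raw attention scores before the softmax:
# disallowed positions are set to -inf so they receive exactly zero weight.
scores = np.random.default_rng(0).standard_normal((seq_len, seq_len))
masked = np.where(mask == 1.0, scores, -np.inf)
weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

print(weights.round(2))  # rows sum to 1; everything above the diagonal is 0
```

Each row of the resulting weight matrix sums to one and is zero above the diagonal, which is exactly the constraint that a token may not attend to tokens that come after it.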
Related Concepts
- GPT
- Autoregressive
- Transformer
Tags
large-language-models gpt autoregressive transformer
Related Terms
GPT
Generative Pre-trained Transformer - an autoregressive language model architecture that predicts the next token given the preceding context.
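To make "predicts the next token given the preceding context" concrete, here is a sketch of a greedy autoregressive decoding loop; `toy_next_token_logits` is a hypothetical stand-in for a trained model's forward pass, invented for illustration rather than taken from any real API.

```python
import numpy as np

VOCAB_SIZE = 8

def toy_next_token_logits(context: list[int]) -> np.ndarray:
    """Hypothetical stand-in for a trained model's forward pass: returns
    one logit per vocabulary item, conditioned only on previous tokens."""
    rng = np.random.default_rng(seed=sum(context) + len(context))
    return rng.standard_normal(VOCAB_SIZE)

def generate(prompt: list[int], max_new_tokens: int = 5) -> list[int]:
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = toy_next_token_logits(tokens)  # score every candidate token
        next_token = int(np.argmax(logits))     # greedy: take the best-scoring one
        tokens.append(next_token)               # the output becomes new context
    return tokens

print(generate([1, 2, 3]))  # e.g. [1, 2, 3, 7, ...]
```

Real systems typically sample from the logits (with temperature, top-k, or nucleus sampling) instead of taking the argmax, but the loop structure is the same: each generated token is appended to the context for the next prediction.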
Transformer
A neural network architecture introduced in 'Attention Is All You Need' (Vaswani et al., 2017) that dispenses with recurrence and convolution in favor of self-attention, and that has become the foundation for modern LLMs.
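As a sketch of the mechanism named in the paper's title, the following implements single-head scaled dot-product self-attention under simplifying assumptions: one head, no masking, and random placeholder projection matrices rather than learned weights.

```python
import numpy as np

def self_attention(x: np.ndarray, w_q: np.ndarray,
                   w_k: np.ndarray, w_v: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product self-attention (unmasked)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                 # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ v                              # weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 4, 16
x = rng.standard_normal((seq_len, d_model))         # placeholder token embeddings
w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)       # (4, 16)
```

Adding the causal mask from the earlier sketch to `scores` before the softmax turns this encoder-style attention into the decoder-only variant this entry defines.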