Causal Mask - AI & ML Glossary | Farez Vadsaria

Large Language Models

Causal Mask

An attention mask that restricts each token to attending only to itself and earlier positions, never to later ones. This restriction is what makes autoregressive generation possible.

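Because the mask hides future tokens, a decoder-only model such as GPT can be trained on every position of a sequence in parallel while each prediction still depends only on the preceding context. This matches how the model later generates text one token at a time.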
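To make the definition concrete, here is a minimal sketch in plain Python. It is illustrative only: the helper names `causal_mask` and `masked_softmax` and the score values are assumptions for this example, and real models typically build the mask with a tensor library rather than nested lists.

```python
import math

def causal_mask(n):
    """Return an n x n mask; mask[i][j] is True when query i may attend to key j."""
    return [[j <= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax over scores, giving zero weight to masked-out positions."""
    weights = []
    for score_row, mask_row in zip(scores, mask):
        # Blocked positions are set to -inf, so exp() maps them to exactly 0.
        masked = [s if keep else -math.inf for s, keep in zip(score_row, mask_row)]
        peak = max(masked)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Hypothetical raw attention scores for a 3-token sequence (all equal for clarity).
scores = [[0.0, 0.0, 0.0] for _ in range(3)]
mask = causal_mask(3)
weights = masked_softmax(scores, mask)

for row in weights:
    print([round(w, 3) for w in row])
# [1.0, 0.0, 0.0]
# [0.5, 0.5, 0.0]
# [0.333, 0.333, 0.333]
```

The mask is lower-triangular: the first token sees only itself, while the last token sees the whole sequence. Setting blocked scores to negative infinity before the softmax is a common way to apply the mask, since those positions then receive exactly zero attention weight.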

Tags

large-language-models attention-mask autoregressive gpt

Related Terms

Attention Mask

A binary mask indicating which tokens should be attended to, used to handle padding and causal masking.

GPT

Generative Pre-trained Transformer - an autoregressive language model architecture that predicts the next token given previous context.
