Large Language Models

Causal Mask

An attention mask that lets each token attend only to itself and to earlier positions, never to later ones. This restriction is crucial for autoregressive generation.

Decoder-only language models such as GPT apply this mask in their self-attention layers, so the prediction made at each position depends only on the tokens up to that position.
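The masking rule can be illustrated with a small pure-Python sketch. It is not taken from any particular model or library: the function names `causal_mask` and `masked_softmax` are hypothetical, and the uniform scores are made up for illustration. The sketch assumes the common convention of setting disallowed scores to negative infinity before the softmax, so that they receive zero weight.

```python
import math

def causal_mask(n):
    """Return an n x n matrix; mask[i][j] is True when position i may attend to position j."""
    return [[j <= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax after setting disallowed scores to -inf."""
    out = []
    for row, allowed in zip(scores, mask):
        masked = [s if ok else -math.inf for s, ok in zip(row, allowed)]
        peak = max(masked)  # always finite: the diagonal entry is allowed
        exps = [math.exp(s - peak) for s in masked]  # exp(-inf) evaluates to 0.0
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Hypothetical input: uniform raw attention scores for a 4-token sequence.
mask = causal_mask(4)
scores = [[0.0] * 4 for _ in range(4)]
weights = masked_softmax(scores, mask)

for row in weights:
    print([round(w, 2) for w in row])
```

With uniform scores, row i spreads its weight evenly over positions 0 through i and gives zero weight to every later position, so the first token attends only to itself and the last token attends to the whole sequence.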

  • Attention Mask
  • Autoregressive
  • GPT

Tags

large-language-models attention-mask autoregressive gpt

Added: November 18, 2025