Large Language Models
Cross-Attention
Attention between two different sequences, where queries come from one and keys/values from another.
In encoder-decoder Transformers, cross-attention is how each decoder position attends to the encoder's output, so generation is conditioned on the input sequence; this makes it a key building block of modern large language model systems.
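A minimal single-head sketch in NumPy, assuming illustrative shapes and placeholder names such as decoder_states and encoder_states (nothing here is tied to a specific library's API):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries_from, keys_values_from, Wq, Wk, Wv):
    """Single-head cross-attention: queries are projected from one
    sequence, keys and values from another."""
    Q = queries_from @ Wq          # (len_a, d_k)
    K = keys_values_from @ Wk      # (len_b, d_k)
    V = keys_values_from @ Wv      # (len_b, d_v)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (len_a, len_b)
    weights = softmax(scores, axis=-1)         # each query's weights over the other sequence
    return weights @ V                         # (len_a, d_v)

# Illustrative shapes: 4 decoder positions attend over 6 encoder positions.
d_model = 8
rng = np.random.default_rng(0)
decoder_states = rng.normal(size=(4, d_model))   # sequence providing the queries
encoder_states = rng.normal(size=(6, d_model))   # sequence providing keys/values
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(cross_attention(decoder_states, encoder_states, Wq, Wk, Wv).shape)  # (4, 8)
```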
Related Concepts
- Attention Mechanism
- Encoder-Decoder
- Self-Attention
Tags
large-language-models attention-mechanism encoder-decoder self-attention
Related Terms
Attention Mechanism
A technique that allows neural networks to focus on relevant parts of the input when producing each output, assigning different weights to different input elements.
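In the widely used scaled dot-product form (Vaswani et al., 2017), the weights are a softmax over query-key similarities, scaled by the key dimension d_k:

```latex
\mathrm{Attention}(Q, K, V) = \operatorname{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```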
Encoder-Decoder
An architecture in which the encoder processes the input and the decoder generates the output, used for translation and other sequence-to-sequence tasks.
Self-Attention
A mechanism where each token attends to all other tokens in the sequence to understand contextual relationships.
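Self-attention can be viewed as the same operation with queries, keys, and values all projected from a single sequence; a minimal single-head NumPy sketch with illustrative names and shapes:

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    # Q, K, V are all projections of the same sequence x.
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each token's weights over all tokens
    return weights @ V

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 8))                          # one sequence of 5 tokens
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)           # (5, 8)
```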