Large Language Models

The context window is the maximum number of tokens a model can attend to at once, and it is a critical limitation of language models. It determines how much information the model can “remember” and work with simultaneously.

Evolution

  • Early models: 512-2048 tokens
  • GPT-3: 2048-4096 tokens
  • Modern models: 8K-200K+ tokens (Claude, GPT-4, Gemini)
  • Research models: 1M+ tokens

Technical Constraint

Context windows are limited because the attention mechanism has O(n²) time and memory complexity in the sequence length n: every token attends to every other token, so longer contexts are computationally expensive and memory-intensive.
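A minimal sketch of why the cost is quadratic: naive scaled dot-product attention materializes an n × n score matrix, so doubling the sequence length quadruples the memory for that matrix. This is an illustrative NumPy implementation, not any specific model's code.

```python
import numpy as np

def naive_attention(Q, K, V):
    # Q, K, V: (n, d) arrays for a sequence of n tokens.
    # The scores matrix is (n, n) -- this is the O(n^2) term.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    # Numerically stable softmax over each row.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Float32 score-matrix memory alone, at a few sequence lengths:
for n in (512, 2048, 8192):
    print(f"n={n:5d}: score matrix ~{n * n * 4 / 1e6:.1f} MB")
```

Running the loop shows ~1 MB at 512 tokens but ~268 MB at 8192, before counting multiple heads and layers, which is why long-context models need approximations or heavy hardware.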

Practical Impact

Larger context windows enable:

  • Processing entire codebases or books
  • Maintaining longer conversations
  • Working with more examples in prompts
  • Better understanding of complex, interconnected information
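In practice, working within a context window means checking whether an input fits before sending it. A rough sketch, assuming the common ~4-characters-per-token heuristic for English text (real tokenizers such as tiktoken give exact counts; the function names here are hypothetical):

```python
def rough_token_count(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # A real tokenizer should be used for exact budgeting.
    return max(1, len(text) // 4)

def fits_in_context(text: str, context_window: int,
                    reserved_for_output: int = 1024) -> bool:
    # Leave headroom for the model's generated response.
    return rough_token_count(text) + reserved_for_output <= context_window

doc = "word " * 10_000          # 50,000 chars ~ 12,500 tokens
print(fits_in_context(doc, 8_192))    # False: overflows an 8K window
print(fits_in_context(doc, 200_000))  # True: fits a 200K window
```

The `reserved_for_output` margin matters because the context window is shared between the prompt and the model's reply.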

Tags

llm architecture limitations

Related Terms

Added: January 15, 2025