Technical Terms

Sparse Attention

An attention mechanism that computes attention scores for only a subset of token-position pairs rather than for all pairs. This reduces the quadratic scaling cost of full attention while preserving most of the information in the relevant context.

Defined in 152nd Edition, Mar 24, 2026
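
To make the definition concrete, here is a minimal sketch of one common sparsity pattern, a sliding window, in which each position attends only to neighbors within a fixed distance. The function names, the `window` parameter, and the choice of pattern are illustrative assumptions, not something the entry specifies; other patterns (strided, block, global tokens) restrict the pairs differently.

```python
import math

def sliding_window_attention(q, k, v, window):
    """Hypothetical helper: position i attends only to positions j with |i - j| <= window.

    q, k, v are lists of equal length n, each holding one vector (list of floats) per position.
    """
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        # Scaled dot-product scores for the allowed key positions only.
        scores = [sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
                  for j in range(lo, hi)]
        # Numerically stable softmax over the local window.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Weighted average of the value vectors inside the window.
        out.append([sum(w * v[j][c] for w, j in zip(weights, range(lo, hi))) / total
                    for c in range(len(v[0]))])
    return out

def scored_pairs(n, window):
    """Number of (query, key) pairs actually scored for a sequence of length n."""
    return sum(min(n, i + window + 1) - max(0, i - window) for i in range(n))

print(scored_pairs(8, 1))  # 22 pairs, versus 8 * 8 = 64 for full attention
```

For a fixed window the number of scored pairs grows roughly linearly with sequence length, which is where the saving over the quadratic cost of full attention comes from. Setting `window` to at least `n - 1` recovers full attention in this sketch.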

Appearances: 1

First appeared: Mar 2026

Most recent: Mar 2026

Category: Technical Terms

Across the corpus (1 defined)

Defined: 152nd Edition, W12 · Mar 24, 2026