Attention Lesson 1 of 4

Why Context Matters

The Limitation of Static Representations

The previous course established that words can be represented as numeric vectors. However, static word vectors exhibit a fundamental limitation: a given word receives an identical vector regardless of context.

Consider the word "bank" in two distinct contexts:

"I went to the bank to deposit money"

"I sat on the river bank watching fish"

The lexical form is identical, but the semantic content differs entirely. Under a static embedding scheme, "bank" maps to the same vector in both sentences — a representational failure that the attention mechanism resolves.
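The failure is easy to see in code. Below is a minimal sketch (with a toy vocabulary and random vectors, all hypothetical) of a static embedding lookup: each word maps to one fixed vector, so "bank" comes out identical in both sentences.

```python
import numpy as np

# Toy static embedding table: one fixed vector per word (hypothetical values).
rng = np.random.default_rng(0)
vocab = ["i", "went", "to", "the", "bank", "deposit", "money",
         "sat", "on", "river", "watching", "fish"]
embeddings = {word: rng.standard_normal(4) for word in vocab}

def embed(sentence):
    # A static lookup: each word maps to the same vector in any context.
    return [embeddings[w] for w in sentence.lower().split()]

financial = embed("i went to the bank to deposit money")
geographic = embed("i sat on the river bank watching fish")

# "bank" is token index 4 in the first sentence and index 5 in the second.
same = np.array_equal(financial[4], geographic[5])
print(same)  # True: the static lookup cannot distinguish the two senses
```

The lookup has no access to the surrounding words, so no static table, however well trained, can separate the two senses.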

Interactive Demonstration

The "bank" Problem

In the interactive version of this lesson, clicking each sentence highlights how context changes the meaning of "bank".

Context as a Disambiguation Mechanism

Human language comprehension resolves lexical ambiguity through surrounding words: "deposit" and "money" activate the financial interpretation, while "river" and "fish" activate the geographical one.

The attention mechanism formalizes this process. For each token in the input, attention computes a weighted relevance score over all other tokens, enabling the model to construct context-dependent representations.

Principle: Token meaning is a function of context. Attention is the computational mechanism that captures this dependency.
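The principle above can be sketched in a few lines. This is a simplified, hypothetical example (hand-picked 2-d vectors, a single query, no learned projections): the representation of "bank" is rebuilt as a relevance-weighted sum over its context tokens, so the same word yields different vectors in different sentences.

```python
import numpy as np

def attention_weights(query, keys):
    # Scaled dot-product relevance scores, normalized with softmax.
    scores = keys @ query / np.sqrt(query.shape[0])
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

# Hypothetical 2-d token vectors; axis 0 ~ "finance", axis 1 ~ "nature".
tokens = {
    "bank":    np.array([0.5, 0.5]),  # ambiguous on its own
    "deposit": np.array([1.0, 0.0]),
    "money":   np.array([1.0, 0.0]),
    "river":   np.array([0.0, 1.0]),
    "fish":    np.array([0.0, 1.0]),
}

def contextualize(word, context):
    # New representation of `word`: a relevance-weighted sum of context vectors.
    keys = np.stack([tokens[w] for w in context])
    weights = attention_weights(tokens[word], keys)
    return weights @ keys

financial = contextualize("bank", ["bank", "deposit", "money"])
geographic = contextualize("bank", ["bank", "river", "fish"])
print(financial, geographic)  # the two "bank" vectors now differ
```

Real attention layers add learned query, key, and value projections and run over every token in parallel, but the core operation is exactly this weighted sum: the output for "bank" depends on which words surround it.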

Key Takeaways

  • Static vectors produce identical representations regardless of context
  • Lexical ambiguity requires context-dependent representation
  • Attention resolves this by computing relevance weights across all tokens