Words as Numbers
Vector Structure
A word vector typically comprises 50 to 300 real-valued dimensions. Each dimension encodes some aspect of meaning, though these aspects are not explicitly labeled — the structure emerges from training on large-scale text corpora.
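Concretely, a word vector is just an array of floats. The sketch below builds a toy vector with made-up values (a real embedding would come from a trained model such as word2vec or GloVe and have 50 to 300 dimensions):

```python
import numpy as np

# A toy 8-dimensional "word vector". The values are illustrative,
# not taken from any trained model; real vectors have 50-300 dims.
king = np.array([0.61, -0.22, 0.87, 0.10, -0.45, 0.33, 0.05, -0.71])

print(king.shape)  # (8,) -- one float per dimension
print(king.dtype)  # float64
```

Each position in the array is one of the unlabeled dimensions described above.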
Select different words below to examine their vector representations:
Explore Word Vectors
Green bars = positive values, Red bars = negative values. Each bar is one dimension.
Cross-Word Comparison
The critical property of word vectors is that semantically related words exhibit similar patterns of values across dimensions. Compare two words to observe this correspondence:
Compare Two Words
Green = Word 1, Blue = Word 2. Notice how similar words show similar patterns.
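The similarity the comparison view shows visually is usually measured with cosine similarity. Here is a minimal sketch using hand-picked toy vectors (the values are assumptions for illustration, not trained embeddings):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors: "cat" and "dog" are given similar patterns; "car" is not.
cat = np.array([0.8, 0.1, -0.4, 0.6])
dog = np.array([0.7, 0.2, -0.3, 0.5])
car = np.array([-0.5, 0.9, 0.6, -0.2])

print(cosine_similarity(cat, dog))  # close to 1: similar patterns
print(cosine_similarity(cat, car))  # negative: dissimilar patterns
```

Related words point in similar directions, so their cosine similarity approaches 1; unrelated words score near or below 0.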
Dimension Interpretation
Individual dimensions are not explicitly programmed with semantic labels. The representational structure emerges from distributional patterns in the training data. However, post-hoc analysis has identified dimensions that loosely align with interpretable axes:
- Gender (masculine ↔ feminine)
- Royalty (royal ↔ common)
- Scale (large ↔ small)
- Sentiment (positive ↔ negative)
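One way such axes are identified post hoc is by taking the difference between a pair of contrasting words and projecting other words onto it. The toy 4-D vectors below are hand-crafted so that the first dimension behaves like a gender axis; real embeddings only approximate this:

```python
import numpy as np

# Illustrative toy vectors, not from a trained model.
man   = np.array([-0.9, 0.1, 0.3, 0.2])
woman = np.array([ 0.9, 0.1, 0.3, 0.2])
king  = np.array([-0.8, 0.9, 0.1, 0.4])
queen = np.array([ 0.8, 0.9, 0.1, 0.4])

# Candidate "gender axis": the difference between two gendered words.
gender_axis = woman - man
gender_axis = gender_axis / np.linalg.norm(gender_axis)

# Projections separate words along the masculine/feminine axis.
print(float(king @ gender_axis))   # negative: masculine end
print(float(queen @ gender_axis))  # positive: feminine end
```

No dimension was labeled "gender" during training; the axis only emerges because gendered words occur in systematically different contexts.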
Emergent structure: Semantic organization arises from statistical regularities in text, not from explicit annotation. This is how neural networks acquire conceptual representations without supervised labeling of meaning.
Key Takeaways
- Word vectors comprise 50–300 real-valued dimensions
- Each dimension encodes some aspect of distributional meaning
- Semantically related words exhibit similar vector patterns