Words as Numbers
Vector Structure
A word vector typically comprises 50 to 300 real-valued dimensions. Each dimension encodes some aspect of meaning, though these aspects are not explicitly labeled — the structure emerges from training on large-scale text corpora.
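Concretely, a word vector is just an array of floats. The sketch below builds a toy vector with made-up values (a real embedding would come from a trained model such as word2vec or GloVe and have 50 to 300 dimensions):

```python
import numpy as np

# A toy 8-dimensional "word vector". The values are illustrative,
# not taken from any trained model; real vectors have 50-300 dims.
king = np.array([0.61, -0.22, 0.87, 0.10, -0.45, 0.33, 0.05, -0.71])

print(king.shape)  # (8,) -- one float per dimension
print(king.dtype)  # float64
```

Each position in the array is one of the unlabeled dimensions described above.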
Select different words below to examine their vector representations:
Explore Word Vectors
Green bars = positive values, Red bars = negative values. Each bar is one dimension.
Cross-Word Comparison
The critical property of word vectors is that semantically related words exhibit similar patterns of values across dimensions. Compare two words to observe this correspondence:
Compare Two Words
Green = Word 1, Blue = Word 2. Notice how similar words show similar patterns.
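The similarity the comparison view shows visually is usually measured with cosine similarity. Here is a minimal sketch using hand-picked toy vectors (the values are assumptions for illustration, not trained embeddings):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors: "cat" and "dog" are given similar patterns; "car" is not.
cat = np.array([0.8, 0.1, -0.4, 0.6])
dog = np.array([0.7, 0.2, -0.3, 0.5])
car = np.array([-0.5, 0.9, 0.6, -0.2])

print(cosine_similarity(cat, dog))  # close to 1: similar patterns
print(cosine_similarity(cat, car))  # negative: dissimilar patterns
```

Related words point in similar directions, so their cosine similarity approaches 1; unrelated words score near or below 0.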
Dimension Interpretation
Individual dimensions are not explicitly programmed with semantic labels. The representational structure emerges from distributional patterns in the training data. However, post-hoc analysis has identified dimensions that loosely align with interpretable axes:
- Gender (masculine ↔ feminine)
- Royalty (royal ↔ common)
- Scale (large ↔ small)
- Sentiment (positive ↔ negative)
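One way such axes are identified post hoc is by taking the difference between a pair of contrasting words and projecting other words onto it. The toy 4-D vectors below are hand-crafted so that the first dimension behaves like a gender axis; real embeddings only approximate this:

```python
import numpy as np

# Illustrative toy vectors, not from a trained model.
man   = np.array([-0.9, 0.1, 0.3, 0.2])
woman = np.array([ 0.9, 0.1, 0.3, 0.2])
king  = np.array([-0.8, 0.9, 0.1, 0.4])
queen = np.array([ 0.8, 0.9, 0.1, 0.4])

# Candidate "gender axis": the difference between two gendered words.
gender_axis = woman - man
gender_axis = gender_axis / np.linalg.norm(gender_axis)

# Projections separate words along the masculine/feminine axis.
print(float(king @ gender_axis))   # negative: masculine end
print(float(queen @ gender_axis))  # positive: feminine end
```

No dimension was labeled "gender" during training; the axis only emerges because gendered words occur in systematically different contexts.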
Emergent structure: Semantic organization arises from statistical regularities in text, not from explicit annotation. This is how neural networks acquire conceptual representations without supervised labeling of meaning.
Key Takeaways
- Word vectors comprise 50–300 real-valued dimensions
- Each dimension encodes some aspect of distributional meaning
- Semantically related words exhibit similar vector patterns