Review:
Static Word Embeddings (Word2Vec, GloVe)
Overall review score: 4.2 / 5
⭐⭐⭐⭐
Scores range from 0 to 5.
Static word embeddings such as Word2Vec and GloVe are natural language processing techniques that assign each word a single fixed vector representation. These vectors capture semantic and syntactic relationships learned from words' distributional contexts in large text corpora, letting algorithms measure word similarity and solve analogies efficiently. Because the embeddings are static, a word's vector does not change with the context in which it appears.
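To make this concrete, here is a minimal sketch of training Word2Vec with the gensim library. The toy corpus and hyperparameters are illustrative placeholders; meaningful embeddings require corpora with millions of tokens.

```python
# Minimal Word2Vec sketch using gensim (assumed installed: pip install gensim).
# The toy corpus below is a placeholder; real embeddings need large corpora.
from gensim.models import Word2Vec

corpus = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "man", "walks", "in", "the", "city"],
    ["the", "woman", "walks", "in", "the", "city"],
]

# vector_size, window, etc. are illustrative hyperparameters.
model = Word2Vec(
    sentences=corpus,
    vector_size=50,   # dimensionality of each word vector
    window=2,         # context window size
    min_count=1,      # keep every word, even rare ones (toy corpus)
    sg=1,             # 1 = skip-gram; 0 = CBOW
)

vec = model.wv["king"]                      # fixed 50-dim vector for "king"
sim = model.wv.similarity("king", "queen")  # cosine similarity between words
print(vec.shape, round(float(sim), 3))
```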
Key Features
- Pre-trained, fixed vector representations of words
- Capture semantic and syntactic relationships between words
- Learned from large text corpora using models such as Word2Vec or GloVe
- Efficient to use in downstream NLP tasks thanks to their dense vector form (see the sketch after this list)
- Not context-dependent: each word has a single embedding regardless of usage
- Widely adopted and well understood in the NLP community
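As a sketch of how fixed vectors feed downstream tasks: since each word maps to one vector, a sentence can be reduced to a fixed-length feature vector by average pooling and handed to any standard classifier. The tiny 4-dimensional embedding table below is hypothetical; in practice you would load pre-trained Word2Vec or GloVe vectors.

```python
import numpy as np

# Hypothetical 4-dim embedding table; real GloVe/Word2Vec vectors have
# 50-300 dimensions and are loaded from a pre-trained file.
EMBEDDINGS = {
    "the":   np.array([0.1, 0.0, 0.2, 0.1]),
    "movie": np.array([0.7, 0.3, 0.1, 0.5]),
    "was":   np.array([0.0, 0.1, 0.1, 0.0]),
    "great": np.array([0.9, 0.8, 0.2, 0.6]),
}
UNK = np.zeros(4)  # fallback for out-of-vocabulary words

def sentence_vector(tokens):
    """Average-pool word vectors into one fixed-length feature vector."""
    vecs = [EMBEDDINGS.get(t, UNK) for t in tokens]
    return np.mean(vecs, axis=0)

features = sentence_vector("the movie was great".split())
print(features)  # 4-dim vector usable as input to any classifier
```

Note that average pooling discards word order, which is part of why static embeddings serve as a strong baseline rather than a state-of-the-art approach.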
Pros
- Simple and computationally efficient to produce and deploy
- Effective at capturing fundamental semantic relationships (e.g., king - man + woman ≈ queen; see the analogy sketch after this list)
- Provides a solid baseline for many NLP applications
- Supported by extensive research and available pre-trained models
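The analogy property above can be checked with gensim's vector arithmetic. This sketch assumes the pre-trained "glove-wiki-gigaword-50" vectors can be fetched via gensim's downloader (a download of roughly 66 MB); the exact top result can vary between pre-trained models.

```python
# Sketch of the classic analogy test; assumes gensim is installed and the
# "glove-wiki-gigaword-50" pre-trained vectors can be downloaded.
import gensim.downloader as api

glove = api.load("glove-wiki-gigaword-50")  # returns a KeyedVectors object

# Vector arithmetic: king - man + woman ≈ ?
result = glove.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # typically [('queen', ...)] for this model
```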
Cons
- Lack contextual understanding: a single vector per word cannot distinguish word senses (polysemy; see the sketch after this list)
- Fixed embeddings may not adapt well to new or specialized vocabulary without retraining
- Less effective than contextual embeddings (e.g., BERT) for tasks requiring nuanced, context-sensitive understanding
- Can encode biases present in training data, leading to problematic outputs
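To illustrate the polysemy limitation: a static embedding is a plain table lookup, so "bank" receives the identical vector whether the context is a river or a deposit. The one-entry lookup table below is hypothetical.

```python
import numpy as np

# Hypothetical static embedding table: exactly one vector per word.
EMBEDDINGS = {"bank": np.array([0.4, 0.1, 0.8])}

def embed(word, context):
    """Static lookup: the context argument is ignored entirely."""
    return EMBEDDINGS[word]

river = embed("bank", context="she sat by the river bank")
money = embed("bank", context="he deposited cash at the bank")
print(np.array_equal(river, money))  # True: both senses share one vector
```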