Word Embeddings

Topic: NLP

Word Vector Representations

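Word embeddings map each word to a dense, real-valued vector with far fewer dimensions than the vocabulary size. Words that appear in similar contexts end up with similar vectors, so distance between vectors can stand in for similarity of meaning.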
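To make the idea of "similar vectors" concrete, here is a minimal sketch that compares words by cosine similarity. The 4-dimensional vectors and the name toy_vectors are invented for illustration; real embeddings are learned from text and typically have tens to hundreds of dimensions.

```python
import math

# Toy 4-dimensional vectors, made up purely for illustration.
toy_vectors = {
    "king":  [0.8, 0.6, 0.1, 0.0],
    "queen": [0.7, 0.7, 0.1, 0.1],
    "apple": [0.0, 0.1, 0.9, 0.6],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

sim_royal = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_fruit = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])

# With these toy values, "king" is closer to "queen" than to "apple".
print(sim_royal > sim_fruit)
```

Cosine similarity is a common choice here because it compares direction and ignores vector length.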

Word2Vec

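Skip-gram predicts the surrounding context words from a center word. CBOW (continuous bag of words) does the reverse and predicts the center word from its surrounding context. Negative sampling speeds up training: instead of computing a softmax over the whole vocabulary, each update only scores the true word and a handful of randomly sampled "negative" words.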
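The difference between the two objectives is easiest to see in the training examples they generate. The sketch below builds those examples from a token list and draws negative samples weighted by unigram count raised to the 0.75 power, as in the original Word2Vec work. The function names, the window parameter, and the fixed seed are assumptions made for this illustration; no model is trained here.

```python
import random
from collections import Counter

def skipgram_pairs(tokens, window=2):
    """(center, context) pairs: the center word is the input, each neighbor a target."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_examples(tokens, window=2):
    """(context_words, target) examples: the neighbors are the input, the center the target."""
    examples = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, target))
    return examples

def sample_negatives(tokens, positive, k=3, seed=0):
    """Draw k negative words, weighted by count ** 0.75, never the true context word."""
    counts = Counter(tokens)
    candidates = [w for w in counts if w != positive]
    weights = [counts[w] ** 0.75 for w in candidates]
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.choices(candidates, weights=weights, k=k)

tokens = "the quick brown fox jumps".split()
print(skipgram_pairs(tokens, window=1))
print(cbow_examples(tokens, window=1)[2])
print(sample_negatives(tokens, positive="quick"))
```

During training, each positive pair would be scored alongside its sampled negatives, which is far cheaper than normalizing over every word in the vocabulary.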

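Pre-trained static vectors are available, so you rarely need to train from scratch: the GoogleNews vectors were trained with Word2Vec, and GloVe vectors come from a separate method based on word co-occurrence counts.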

Contextual Embeddings

ELMo uses a bidirectional LSTM; BERT is transformer-based. Unlike Word2Vec and GloVe, both produce a different embedding for the same word depending on the sentence it appears in.
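The contrast between static and contextual embeddings can be shown with a toy example. The sketch below is not ELMo or BERT: it uses made-up 2-dimensional vectors and a hypothetical toy_contextual_embedding function that simply blends a word's vector with the average of its neighbors. It only illustrates that a context-aware function returns different vectors for the same word in different sentences.

```python
# Made-up 2-dimensional static vectors, for illustration only.
static = {
    "river": [1.0, 0.0],
    "money": [0.0, 1.0],
    "bank":  [0.5, 0.5],
}

def static_embedding(tokens, index):
    """A static model looks the word up and ignores the rest of the sentence."""
    return static[tokens[index]]

def toy_contextual_embedding(tokens, index, mix=0.5):
    """Toy stand-in for a contextual model: blend the word with its neighbors' mean."""
    own = static[tokens[index]]
    others = [static[t] for i, t in enumerate(tokens) if i != index]
    if not others:  # single-word sentence: nothing to mix in
        return list(own)
    dim = len(own)
    context_mean = [sum(v[d] for v in others) / len(others) for d in range(dim)]
    return [(1 - mix) * own[d] + mix * context_mean[d] for d in range(dim)]

sent_a = ["river", "bank"]
sent_b = ["money", "bank"]

# Same vector for "bank" in both sentences.
print(static_embedding(sent_a, 1) == static_embedding(sent_b, 1))
# Different vectors for "bank" once the context is mixed in.
print(toy_contextual_embedding(sent_a, 1), toy_contextual_embedding(sent_b, 1))
```

Real contextual models learn how to combine context through many layers rather than using a fixed average, but the input and output shape of the idea is the same: sentence in, one vector per token out.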

Using Embeddings

In gensim, train your own model with gensim.models.Word2Vec. To load pre-trained vectors stored in the word2vec format, use KeyedVectors.load_word2vec_format.
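A minimal sketch of both calls follows, assuming gensim 4.x is installed (older 3.x releases use size instead of vector_size). The helper names train_toy_word2vec and load_pretrained, the two-sentence corpus, and the hyperparameter values are assumptions chosen for illustration; a corpus this small will not produce meaningful vectors.

```python
def train_toy_word2vec(sentences):
    """Train a tiny skip-gram model with negative sampling (gensim 4.x parameter names)."""
    from gensim.models import Word2Vec  # imported lazily so the file loads without gensim

    return Word2Vec(
        sentences=sentences,
        vector_size=50,  # embedding dimension
        window=2,        # context words on each side
        min_count=1,     # keep every word in this tiny corpus
        sg=1,            # 1 = skip-gram, 0 = CBOW
        negative=5,      # negative samples per positive pair
        workers=1,
        seed=0,
    )

def load_pretrained(path, binary=True):
    """Load vectors stored in the word2vec format; the caller supplies the file path."""
    from gensim.models import KeyedVectors

    return KeyedVectors.load_word2vec_format(path, binary=binary)

# Toy corpus: a list of tokenized sentences.
toy_corpus = [
    ["the", "king", "rules", "the", "land"],
    ["the", "queen", "rules", "the", "land"],
]

try:
    model = train_toy_word2vec(toy_corpus)
except ImportError:
    print("gensim is not installed; run `pip install gensim` to try this sketch")
else:
    print(model.wv["king"].shape)                 # vector for a single word
    print(model.wv.most_similar("king", topn=2))  # nearest neighbors by cosine similarity
```

The same model.wv lookups work on the object returned by load_pretrained, since both are KeyedVectors. Pass binary=True for binary files such as the GoogleNews vectors and binary=False for text files in the same format.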

Key Takeaways

  1. Word2Vec and GloVe provide static embeddings
  2. BERT provides contextual embeddings
  3. Pre-trained vectors are useful for downstream tasks
