Introduction
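Embedding layers map discrete tokens, represented as integer ids, to dense, trainable vectors. Because the vectors are learned during training, tokens that appear in similar contexts tend to end up with similar vectors, which is how embeddings capture semantic relationships.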
Basic Embedding
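from tensorflow.keras import layers

# vocab_size=10000, embedding_dim=64, sequence length=100
# (input_length is deprecated in recent Keras versions; the sequence
# length is taken from the input itself)
embedding = layers.Embedding(
    input_dim=10000,   # vocabulary size: valid token ids are 0..9999
    output_dim=64      # size of each embedding vector
)
# Input: (batch, 100) integers
# Output: (batch, 100, 64)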
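Conceptually, an embedding layer is a lookup table: row i of a weight matrix is the vector for token id i. The NumPy sketch below imitates that lookup to make the shapes concrete. The names `table`, `embed`, and `token_ids` are illustrative assumptions, and this is a picture of the idea rather than how Keras implements the layer.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, embedding_dim, seq_len = 10000, 64, 100

# Hypothetical weight table: one row per token id (randomly initialised here).
table = rng.normal(size=(vocab_size, embedding_dim)).astype("float32")


def embed(token_ids: np.ndarray) -> np.ndarray:
    """Look up one row of `table` for every integer token id."""
    return table[token_ids]


# A batch of 2 sequences, each holding 100 integer token ids.
token_ids = rng.integers(0, vocab_size, size=(2, seq_len))
vectors = embed(token_ids)
print(vectors.shape)  # (2, 100, 64)

# The lookup gives the same result as multiplying a one-hot vector by the table.
one_hot = np.zeros(vocab_size, dtype="float32")
one_hot[token_ids[0, 0]] = 1.0
assert np.allclose(one_hot @ table, vectors[0, 0])
```

Indexing avoids ever building the one-hot vectors, which is why embedding layers take integer ids as input instead of one-hot encodings.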
Using Pre-trained Embeddings
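import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Load pre-trained embeddings (e.g., GloVe)
# Each line holds a word followed by its vector components
embeddings_index = {}
with open('glove.6B.100d.txt', encoding='utf-8') as f:
    for line in f:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:], dtype='float32')
        embeddings_index[word] = coefs

# Create embedding matrix
# Assumes vocab_size and word_index (word -> integer id, e.g. from a
# tokenizer) are already defined
embedding_matrix = np.zeros((vocab_size, 100))
for word, i in word_index.items():
    if i >= vocab_size:
        continue  # skip ids that fall outside the matrix
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        # Words missing from the pre-trained file keep an all-zero row
        embedding_matrix[i] = embedding_vector

# Use in model
# A constant initializer loads the matrix and works across Keras versions
embedding_layer = layers.Embedding(
    vocab_size, 100,
    embeddings_initializer=keras.initializers.Constant(embedding_matrix),
    trainable=False  # Freeze embeddings
)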
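The matrix-building step relies on a `word_index` mapping and a `vocab_size` that have to come from somewhere. The sketch below shows one way those pieces could fit together, using a toy corpus and three made-up 3-dimensional vectors in place of a real GloVe file. The helper names `build_word_index` and `parse_glove_line` are hypothetical, and reserving id 0 for padding is an assumption about the tokenization scheme.

```python
import numpy as np


def build_word_index(texts):
    """Map each distinct lowercase word to an integer id, starting at 1 (0 is kept for padding)."""
    word_index = {}
    for text in texts:
        for word in text.lower().split():
            if word not in word_index:
                word_index[word] = len(word_index) + 1
    return word_index


def parse_glove_line(line):
    """Split one 'word v1 v2 ...' line into (word, float32 vector)."""
    values = line.split()
    return values[0], np.asarray(values[1:], dtype="float32")


# Toy stand-ins for a corpus and a GloVe file (made-up 3-dimensional vectors).
texts = ["the cat sat", "the dog ran"]
glove_lines = ["the 0.1 0.2 0.3", "cat 0.4 0.5 0.6", "dog 0.7 0.8 0.9"]

word_index = build_word_index(texts)
vocab_size = len(word_index) + 1  # +1 for the padding id 0
embeddings_index = dict(parse_glove_line(line) for line in glove_lines)

embedding_matrix = np.zeros((vocab_size, 3), dtype="float32")
hits = 0
for word, i in word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector
        hits += 1

print(f"{hits} of {len(word_index)} words found in the pre-trained vectors")
```

Counting hits is a cheap sanity check: a low hit rate usually points to a mismatch between how the corpus was tokenized and how the pre-trained vocabulary was built (casing, punctuation, and so on).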
Trainable Embeddings
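from tensorflow import keras
from tensorflow.keras import layers

# Fine-tune embeddings: the layer is trainable by default, so its
# weights are updated together with the rest of the model
model = keras.Sequential([
    layers.Embedding(10000, 128),            # (batch, 100) -> (batch, 100, 128)
    layers.GlobalAveragePooling1D(),         # average over the sequence -> (batch, 128)
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid')    # binary classification output
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])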
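The pooling layer in this model collapses the sequence axis by averaging, turning one vector per token into one vector per example. The NumPy sketch below illustrates that computation, including an optional mask so padded positions do not dilute the average. The function name `global_average_pool` is an assumption for illustration; it loosely mirrors the idea behind `GlobalAveragePooling1D` and is not the Keras implementation.

```python
import numpy as np


def global_average_pool(x, mask=None):
    """Average a (batch, steps, features) array over the steps axis.

    If a boolean `mask` of shape (batch, steps) is given, steps where the
    mask is False (padding) are left out of the average.
    """
    if mask is None:
        return x.mean(axis=1)
    m = mask.astype(x.dtype)[:, :, None]      # (batch, steps, 1)
    total = (x * m).sum(axis=1)               # sum over real steps only
    count = np.maximum(m.sum(axis=1), 1.0)    # avoid division by zero
    return total / count


# One sequence of 3 steps with 2 features; the last step is padding.
x = np.array([[[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]])
mask = np.array([[True, True, False]])

print(global_average_pool(x))        # mean over all 3 steps
print(global_average_pool(x, mask))  # mean over the 2 real steps
```

Because averaging discards word order, this kind of model behaves like a bag-of-words classifier over learned vectors, which keeps it small and fast.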
Word2Vec-style Training
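from tensorflow import keras
from tensorflow.keras import layers

vocab_size, embedding_dim = 10000, 128  # example values

# Skip-gram style embeddings: predict a context word from one center word
model = keras.Sequential([
    layers.Embedding(vocab_size, embedding_dim),    # (batch, 1) -> (batch, 1, embedding_dim)
    layers.Flatten(),                               # -> (batch, embedding_dim)
    layers.Dense(vocab_size, activation='softmax')  # distribution over context words
])
# The trained Embedding weights can then be reused in a larger NLP model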
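A skip-gram model is trained on (center word, context word) pairs drawn from a sliding window over each sequence. The sketch below shows one way to generate those pairs from a list of token ids. The function name `skipgram_pairs` and the `window_size` parameter are illustrative assumptions, not part of any library.

```python
def skipgram_pairs(token_ids, window_size=2):
    """Return (center, context) id pairs for each token and its neighbours within the window."""
    pairs = []
    for i, center in enumerate(token_ids):
        start = max(0, i - window_size)
        end = min(len(token_ids), i + window_size + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, token_ids[j]))
    return pairs


sequence = [4, 7, 9, 2]
pairs = skipgram_pairs(sequence, window_size=1)
print(pairs)  # [(4, 7), (7, 4), (7, 9), (9, 7), (9, 2), (2, 9)]
```

The centers would be fed to the model as inputs of shape (batch, 1) and the contexts used as integer labels, for example with a sparse categorical cross-entropy loss. A full softmax over the vocabulary gets expensive as the vocabulary grows, which is why word2vec itself relies on approximations such as negative sampling.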
Practice Problems
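- Create an embedding layer and check its output shape
- Load pre-trained word vectors into an embedding matrix
- Fine-tune pre-trained embeddings by making the layer trainable
- Use GlobalAveragePooling1D to pool a sequence of embeddings into one vector
- Implement an embedding visualization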
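As a possible starting point for the visualization exercise, the sketch below projects embedding vectors to two dimensions with PCA, computed through NumPy's SVD. The random matrix stands in for trained weights, and `project_2d` is a hypothetical helper name. In a Keras model whose first layer is the embedding, the real weights could be read with `model.layers[0].get_weights()[0]`.

```python
import numpy as np


def project_2d(embeddings):
    """Project (n_words, dim) vectors onto their top two principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


rng = np.random.default_rng(0)
embeddings = rng.normal(size=(50, 64))  # stand-in for trained embedding weights
coords = project_2d(embeddings)
print(coords.shape)  # (50, 2)
```

Each row of `coords` can then be drawn as a labelled point in a scatter plot. With trained embeddings, related words would be expected to sit closer together than they do for random vectors.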