Deep Learning Regularization
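Neural networks have enough capacity to memorize their training data. Regularization constrains the model so it generalizes to unseen data instead of overfitting.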
L2 Regularization
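Add a penalty λ||w||² to the loss, which shrinks weights toward zero. Implementation: most optimizers expose this as a weight_decay parameter.
With plain SGD, the L2 penalty and weight decay are equivalent (up to a constant factor in λ). With adaptive optimizers such as Adam they differ, which is why decoupled weight decay (AdamW) exists.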
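To make the penalty concrete, here is a minimal pure-Python sketch of one plain SGD step on loss + λ||w||². The function names (l2_penalty, sgd_step_with_l2) and the toy numbers are illustrative assumptions, not part of any library.

```python
def l2_penalty(weights, lam):
    """Return lam * ||w||^2 for a flat list of weights."""
    return lam * sum(w * w for w in weights)


def sgd_step_with_l2(weights, grads, lr, lam):
    """One plain SGD step on (data loss + lam * ||w||^2).

    The penalty adds 2 * lam * w to each weight's gradient,
    so every step also pulls the weight toward zero.
    """
    return [w - lr * (g + 2.0 * lam * w) for w, g in zip(weights, grads)]


# Hypothetical toy values: zero data gradient isolates the effect of the penalty.
weights = [1.0, -2.0, 0.5]
data_grads = [0.0, 0.0, 0.0]

penalty = l2_penalty(weights, lam=0.1)
new_weights = sgd_step_with_l2(weights, data_grads, lr=0.5, lam=0.1)
```

With a zero data gradient, each weight is simply multiplied by (1 - 2 * lr * lam), which is the "decay" in weight decay.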
Dropout
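Randomly zero a fraction of activations on each training step. This reduces co-adaptation, since no neuron can rely on a specific other neuron being present. Use a Dropout layer.
rate parameter: fraction of units to drop (e.g. 0.5). Active only during training; at inference the layer passes activations through unchanged.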
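The training/inference difference can be sketched in a few lines of pure Python. This assumes the common "inverted dropout" variant, where kept activations are scaled by 1 / (1 - rate) during training so that nothing needs rescaling at inference. The dropout function and its training flag are hypothetical names for illustration.

```python
import random


def dropout(activations, rate, training, rng=random):
    """Inverted dropout over a flat list of activations (illustrative sketch)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # Inference: pass activations through unchanged.
        return list(activations)
    keep = 1.0 - rate
    # Drop each unit with probability `rate`; scale survivors by 1 / keep
    # so the expected value of each activation is unchanged.
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)  # seeded only so the example is repeatable
acts = [1.0] * 10000

train_out = dropout(acts, rate=0.5, training=True, rng=rng)
eval_out = dropout(acts, rate=0.5, training=False, rng=rng)
```

In training mode roughly half the entries are 0.0 and the rest are 2.0, so the mean stays close to 1.0; in inference mode the output equals the input.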
Early Stopping
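Monitor validation loss after each epoch. Stop when it has not improved for a set number of epochs, then restore the weights from the best epoch.
patience: number of epochs without improvement to wait before stopping. Too small and training stops on noise; too large and compute is wasted. Limits overfitting by halting before the model starts fitting noise in the training set.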
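The stopping rule can be sketched independently of any framework. The helper below (early_stopping_epoch, a hypothetical name) takes a precomputed list of validation losses purely for illustration; in a real loop, the losses arrive one epoch at a time and the model weights would be checkpointed wherever the comment indicates.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once `patience` consecutive epochs pass without
    a new best validation loss.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            wait = 0
            # A real training loop would save a copy of the weights here.
        else:
            wait += 1
            if wait >= patience:
                # Stop; a real loop would now restore the saved weights.
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical loss curve: improves until epoch 2, then plateaus.
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
```

Here training stops at epoch 5 and the best weights come from epoch 2. The later improvement at epoch 6 is never seen, which shows the trade-off in choosing patience.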
Data Augmentation
Image: random crops, flips, rotations. Text: back-translation, synonym replacement.
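Increases the effective variety of training data without collecting new labels, which reduces overfitting. Transformations must preserve the label: rotating a handwritten 6 by 180 degrees turns it into a 9.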
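As a rough sketch of the image case, the snippet below applies a random crop and a random horizontal flip to an image stored as a nested list of pixel values. The function names and the nested-list representation are assumptions chosen to keep the example dependency-free; real pipelines typically use a library's transform utilities on arrays or tensors.

```python
import random


def random_flip(image, rng, p=0.5):
    """Flip the image left-to-right with probability p; always returns a copy."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


def random_crop(image, size, rng):
    """Cut a random size x size window out of the image."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)   # randint is inclusive at both ends
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]


def augment(image, size, rng):
    """Random crop followed by a random horizontal flip."""
    return random_flip(random_crop(image, size, rng), rng)


rng = random.Random(0)  # seeded only so the example is repeatable
image = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "image"
samples = [augment(image, size=3, rng=rng) for _ in range(5)]
```

Each call yields a slightly different 3x3 view of the same source image, so the model sees varied inputs that share one label.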
Key Takeaways
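- L2 regularization penalizes large weights via λ||w||²
- Dropout randomly zeroes activations during training only
- Early stopping halts training when validation loss stops improving and restores the best weights
- Data augmentation adds label-preserving variety to the training data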