Learning from Unlabeled Data
Self-supervised learning derives the supervision signal from the data itself, so no human labels are needed.
Contrastive Learning
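Positive pairs: two augmented views of the same sample. Negative pairs: views of different samples. The InfoNCE loss pulls positives together and pushes negatives apart, which maximizes a lower bound on the mutual information between views.
SimCLR, MoCo: contrastive frameworks. BYOL: closely related, but trains without negative pairs.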
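A minimal sketch of the InfoNCE loss may help make the positive/negative roles concrete. It assumes embeddings are plain lists of floats (non-zero vectors), that `positives[i]` is the positive for `anchors[i]`, and that every other entry in the batch acts as a negative. The function names and the temperature of 0.1 are illustrative choices, not taken from any particular framework.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    positives[i] is the positive for anchors[i]; all other
    positives[j] (j != i) serve as negatives for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the positive as the correct "class".
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


views_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(views_a, [[1.0, 0.0], [0.0, 1.0]])
mismatched_loss = info_nce(views_a, [[0.0, 1.0], [1.0, 0.0]])
```

When each anchor matches its own positive, the loss is close to zero; when the pairing is swapped, the loss is large. Real frameworks compute the same quantity on batches of encoder outputs.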
Masked Prediction
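BERT: mask tokens, predict them from context. MAE: mask image patches, reconstruct the pixels. DALL-E, by contrast, predicts discrete image tokens autoregressively rather than filling in masked pixels.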
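The masking step can be sketched as follows. This is a simplified illustration, assuming a token list and a fixed masking probability of 0.15: each selected token is replaced by a `[MASK]` placeholder and becomes a prediction target. BERT's actual recipe also sometimes keeps or randomly replaces selected tokens, which is omitted here. The helper name `mask_tokens` is hypothetical.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (inputs, targets) for masked prediction.

    inputs: the tokens with some positions replaced by MASK.
    targets: the original token at masked positions, None elsewhere
             (unmasked positions contribute nothing to the loss).
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    inputs, targets = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(token)
        else:
            inputs.append(token)
            targets.append(None)
    return inputs, targets


sentence = "the model learns structure from raw text".split()
inputs, targets = mask_tokens(sentence, mask_prob=0.15)
```

The model sees `inputs` and is trained to recover `targets` at the masked positions. MAE follows the same pattern with image patches in place of tokens and a much higher masking ratio.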
Downstream Tasks
Pre-train an encoder on large unlabeled data. Fine-tune it on a smaller labeled dataset for the target task.
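A toy sketch of the second stage, under loud assumptions: `pretrained_encoder` is a hypothetical stand-in for a network already pre-trained on unlabeled data, and it is kept frozen while only a small logistic-regression head is trained on the labeled set. This is linear probing, the simplest variant of the recipe; full fine-tuning would also update the encoder weights. The data and hyperparameters are made up for illustration.

```python
import math


def pretrained_encoder(x):
    # Hypothetical frozen encoder: a fixed feature map standing in
    # for a network pre-trained on unlabeled data.
    return [x[0] + x[1], x[0] - x[1]]


def train_head(features, labels, lr=0.5, epochs=200):
    """Train a logistic-regression head with plain SGD."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            z = sum(w * v for w, v in zip(weights, f)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - y  # gradient of the log loss w.r.t. z
            weights = [w - lr * grad * v for w, v in zip(weights, f)]
            bias -= lr * grad
    return weights, bias


def predict(weights, bias, f):
    return 1 if sum(w * v for w, v in zip(weights, f)) + bias > 0 else 0


# Small labeled dataset for the downstream task (made-up values).
labeled_x = [[2.0, 1.0], [1.5, 0.5], [-1.0, -2.0], [-0.5, -1.5]]
labeled_y = [1, 1, 0, 0]

features = [pretrained_encoder(x) for x in labeled_x]  # encoder stays frozen
weights, bias = train_head(features, labeled_y)
predictions = [predict(weights, bias, f) for f in features]
```

Only the head's few parameters are learned from labels, which is why a small labeled set can suffice once the encoder provides useful features.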
Key Takeaways
- The data itself provides the supervision signal
- Two main approaches: contrastive learning and masked prediction
- Pre-train on unlabeled data, then fine-tune on labeled data