CNN for Images
Convolutional neural networks (CNNs) are a standard choice for image tasks such as classification, detection, and segmentation.
Convolution Operation
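A filter (kernel) slides over the image. At each position, the dot product of the filter and the image patch beneath it is computed. The results form a feature map.
The filter weights are learned during training, so filters come to detect edges, textures, and more complex patterns.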
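The sliding dot product described above can be sketched in plain Python. This is a minimal illustration, not a production implementation: it assumes a single-channel image, stride 1, and no padding, and the function name conv2d_valid and the hand-picked edge kernel are hypothetical choices made for this example. As in most deep learning libraries, the kernel is not flipped, so the operation is strictly a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1

    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the image patch whose
            # top-left corner is at (i, j).
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked (not learned) kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
# Each row prints [0, 2, 0]: the response is nonzero only where the edge is.
```

In a trained CNN the kernel values are not hand-picked like this; they start random and are adjusted by gradient descent.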
Classic Architectures
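LeNet: early CNN for handwritten digit recognition. AlexNet: winner of the 2012 ImageNet challenge (ILSVRC). VGG: deeper networks built from stacked 3x3 convolutions.
ResNet: residual (skip) connections add a block's input to its output, which makes very deep networks trainable. Inception: parallel convolution paths with different filter sizes, concatenated into one output.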
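The residual idea can be illustrated with a toy sketch: a block outputs x + F(x) instead of F(x) alone. The names residual_block and zero_transform are hypothetical, and zero_transform is only a stand-in for a stack of convolution layers whose weights happen to be zero. It is there to show that a residual block can fall back to the identity, so stacking many of them does not destroy the signal.

```python
def residual_block(x, transform):
    """Return x + transform(x), element-wise (the skip connection)."""
    fx = transform(x)
    return [xi + fi for xi, fi in zip(x, fx)]


def zero_transform(x):
    # Hypothetical stand-in for conv layers with all-zero weights.
    return [0.0 for _ in x]


x = [1.0, -2.0, 3.0]

# Pass the input through 50 stacked residual blocks.
out = x
for _ in range(50):
    out = residual_block(out, zero_transform)

print(out == x)  # the input passes through all 50 blocks unchanged
```

Without the skip connection, the same 50 zero-weight layers would map every input to zeros. With it, each block only needs to learn a correction on top of the identity.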
Modern Networks
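EfficientNet: compound scaling, which grows network depth, width, and input resolution together. Vision Transformers (ViT): transformers applied to images split into fixed-size patches.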
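To make the ViT input step concrete, here is a minimal sketch of how an image might be cut into non-overlapping patches and flattened into a sequence of tokens. The helper name image_to_patches is hypothetical, and the sketch assumes a single-channel image stored as a list of rows. A real ViT would also apply a learned linear projection and add position embeddings, which are omitted here.

```python
def image_to_patches(image, patch_size):
    """Split a 2D image into flattened, non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")

    patches = []
    # Walk the grid of patch corners in row-major order.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[top + r][left + c]
                for r in range(patch_size)
                for c in range(patch_size)
            ]
            patches.append(patch)
    return patches


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

patches = image_to_patches(image, patch_size=2)
for patch in patches:
    print(patch)
# Prints four patches:
# [0, 1, 4, 5]
# [2, 3, 6, 7]
# [8, 9, 12, 13]
# [10, 11, 14, 15]
```

By the same arithmetic, a 224x224 image cut into 16x16 patches yields 14 x 14 = 196 patches, which the transformer then treats as a sequence.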
Key Takeaways
- Convolution detects spatial features
- Classic architectures: LeNet, AlexNet, VGG, ResNet, Inception
- Modern networks: EfficientNet scales depth, width, and resolution together; ViT applies transformers to image patches