Describe Video Content
Video captioning generates natural-language descriptions of video content.
Methods
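Captioning is usually framed as a video-to-sequence problem: an encoder turns the frames into features, and a decoder emits the caption one token at a time. The classic pipeline pairs a CNN frame encoder with an LSTM decoder. More recent models use video Transformers for encoding, decoding, or both.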
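The video-to-sequence framing can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a real captioning model: `mean_pool` stands in for the encoder, `greedy_decode` is a generic decoding loop, and `toy_step` is a hypothetical scorer that takes the place of an LSTM or Transformer decoder. All three names are invented for this example.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
StepFn = Callable[[Vector, List[str]], Dict[str, float]]


def mean_pool(frame_features: Sequence[Vector]) -> Vector:
    """Encoder stand-in: average per-frame features into one video vector."""
    if not frame_features:
        raise ValueError("need at least one frame")
    n = len(frame_features)
    dim = len(frame_features[0])
    return [sum(frame[d] for frame in frame_features) / n for d in range(dim)]


def greedy_decode(
    video_vec: Vector,
    step_fn: StepFn,
    bos: str = "<bos>",
    eos: str = "<eos>",
    max_len: int = 20,
) -> List[str]:
    """Decoder loop: at each step, append the highest-scoring next token."""
    tokens = [bos]
    for _ in range(max_len):
        scores = step_fn(video_vec, tokens)  # maps token -> score
        next_token = max(scores, key=scores.get)
        if next_token == eos:
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start-of-sequence marker


def toy_step(video_vec: Vector, prefix: List[str]) -> Dict[str, float]:
    """Hypothetical scorer (assumption): replays a fixed caption chosen by
    the sign of the first feature. A trained decoder would go here."""
    caption = ["a", "dog", "runs"] if video_vec[0] > 0 else ["a", "cat", "sleeps"]
    position = len(prefix) - 1
    target = caption[position] if position < len(caption) else "<eos>"
    vocab = ["a", "dog", "cat", "runs", "sleeps", "<eos>"]
    return {tok: (1.0 if tok == target else 0.0) for tok in vocab}


frames = [[1.0, 0.0], [3.0, 2.0]]   # two frames, two features each (made up)
video_vec = mean_pool(frames)       # [2.0, 1.0]
caption = greedy_decode(video_vec, toy_step)
print(" ".join(caption))            # a dog runs
```

Mean pooling discards frame order entirely, which is why real systems replace it with a recurrent or attention-based encoder. Greedy decoding is also the simplest choice; beam search is a common alternative.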
Datasets
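Common benchmarks include MSVD (short YouTube clips with multiple crowd-sourced captions each), MSR-VTT (10,000 web video clips with about 20 captions each), and VATEX (captions in both English and Chinese).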
Temporal Modeling
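Describing a video requires understanding how its content changes over time, not only what appears in individual frames. Temporal attention lets the decoder weight frames differently at each step, so each generated word can draw on the most relevant moments of the video.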
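As a rough illustration of temporal attention, the sketch below scores each frame against a decoder query, normalizes the scores with a softmax, and returns the weighted average of the frame features. It assumes scaled dot-product scoring, which is one common choice among several (additive scoring is another). The function name `temporal_attention` and the toy numbers are invented for this example.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def temporal_attention(
    query: Vector, frame_features: Sequence[Vector]
) -> Tuple[List[float], Vector]:
    """Return (weights over frames, context vector) for one decoding step."""
    if not frame_features:
        raise ValueError("need at least one frame")
    dim = len(query)
    scale = math.sqrt(dim)

    # Similarity between the decoder query and each frame (scaled dot product).
    scores = [
        sum(q * f for q, f in zip(query, frame)) / scale
        for frame in frame_features
    ]

    # Numerically stable softmax over the time axis.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted average of the frame features.
    context = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, context


# Three frames with two features each; the query matches the second frame best.
frames = [[0.0, 1.0], [2.0, 0.0], [0.0, 0.0]]
weights, context = temporal_attention([1.0, 0.0], frames)
print([round(w, 3) for w in weights])
```

In a captioning model the query would typically be the decoder's hidden state at the current step, so the weights are recomputed for every generated word.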
Key Takeaways
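- Video captioning maps a video to a text description
- Temporal modeling is important because meaning depends on how frames relate over time
- Captioning requires multi-modal understanding that links visual content to language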