RNN for Sequences
Recurrent neural networks (RNNs) process sequential data one element at a time, maintaining a hidden state that summarizes the elements seen so far.
Basic RNN
SimpleRNN processes a sequence one element at a time, updating its hidden state at every step. The output at each step depends on the current input and the previous hidden state.
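The update described above can be sketched in plain Python. This is a minimal illustration, not the implementation of any particular library: it assumes the common formulation h_t = tanh(W_xh x_t + W_hh h_{t-1} + b), and the names `rnn_step`, `rnn_forward`, `W_xh`, `W_hh`, and `b`, along with the toy weights, are hypothetical.

```python
import math

def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One simple-RNN step: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)."""
    h_new = []
    for i in range(len(b)):
        # Weighted sum of the current input and the previous hidden state.
        total = b[i]
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def rnn_forward(inputs, h0, W_xh, W_hh, b):
    """Run the step over a whole sequence, returning every hidden state."""
    h = h0
    states = []
    for x in inputs:
        h = rnn_step(x, h, W_xh, W_hh, b)
        states.append(h)
    return states

# Toy example (assumed values): 2-dimensional inputs, 3 hidden units.
W_xh = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
W_hh = [[0.2, 0.0, -0.1], [0.0, 0.3, 0.1], [0.1, -0.2, 0.4]]
b = [0.0, 0.1, -0.1]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

states = rnn_forward(sequence, [0.0, 0.0, 0.0], W_xh, W_hh, b)
for t, h in enumerate(states):
    print(t, [round(v, 3) for v in h])
```

The same weights are reused at every step; only the hidden state changes, which is what lets the network carry information forward through the sequence.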
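RNNs are trained with backpropagation through time (BPTT), which unrolls the network across the time steps and applies ordinary backpropagation to the unrolled graph. Because the gradient is multiplied by a similar factor at every step, it can shrink exponentially with sequence length. This vanishing gradient problem makes dependencies across long sequences hard to learn.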
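To make the vanishing gradient effect concrete, the sketch below tracks the gradient of the final hidden state with respect to the initial one in a scalar RNN. It assumes the toy recurrence h_t = tanh(w * h_{t-1} + x_t); the function name `gradient_through_time`, the weight 0.9, and the constant inputs are illustrative choices, not values from any real model.

```python
import math

def gradient_through_time(w, inputs, h0=0.0):
    """Return |d h_t / d h_0| after each step of h_t = tanh(w*h_{t-1} + x_t)."""
    h = h0
    grad = 1.0
    magnitudes = []
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: each step multiplies by d h_t / d h_{t-1} = w * (1 - h_t^2).
        grad *= w * (1.0 - h * h)
        magnitudes.append(abs(grad))
    return magnitudes

inputs = [0.5] * 50
mags = gradient_through_time(w=0.9, inputs=inputs)
for t in (1, 10, 25, 50):
    print(t, mags[t - 1])
```

Each factor has magnitude at most |w|, since the tanh derivative never exceeds 1. With |w| < 1 the product therefore decays at least geometrically in the number of steps.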
LSTM and GRU
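LSTM (Long Short-Term Memory) keeps a separate cell state alongside the hidden state and uses three gates to control information flow: the input gate (what new information to write), the forget gate (what to discard from the cell state), and the output gate (how much of the cell state to expose as the hidden state).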
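A single LSTM step might be sketched as follows. This follows the commonly presented formulation, in which every gate sees the concatenation of the previous hidden state and the current input. The names `lstm_step`, `affine`, and the `params` dictionary keys, as well as the random toy weights, are assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Compute W v + b for a list-of-rows matrix W."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + b_i
            for row, b_i in zip(W, b)]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; each gate sees the concatenation [h_prev, x]."""
    z = h_prev + x  # list concatenation
    i = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]    # input gate
    f = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]    # forget gate
    o = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]    # output gate
    g = [math.tanh(a) for a in affine(params["W_g"], z, params["b_g"])]  # candidate values
    # Forget part of the old cell state, then write the gated candidate.
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # The output gate decides how much of the cell state is exposed.
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c

# Toy setup (assumed sizes): 1-dimensional input, 2 hidden units.
random.seed(0)
hidden, n_in = 2, 1
params = {}
for gate in "ifog":
    params["W_" + gate] = [[random.uniform(-0.5, 0.5) for _ in range(hidden + n_in)]
                           for _ in range(hidden)]
    params["b_" + gate] = [0.0] * hidden

h, c = [0.0] * hidden, [0.0] * hidden
for x in ([1.0], [0.5], [-1.0]):
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

When the forget gate is close to 1 and the input gate close to 0, the cell state passes through a step almost unchanged, which is how the cell can preserve information over many steps.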
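GRU (Gated Recurrent Unit) simplifies the LSTM: it has no separate cell state and uses only two gates, update and reset, so it has fewer parameters for the same hidden size.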
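For comparison, here is a minimal sketch of one GRU step. It assumes the convention in which the update gate weights the new candidate, h_t = (1 - u) * h_{t-1} + u * candidate; some libraries swap the roles of u and 1 - u. The names `gru_step`, `affine`, and the `params` keys are hypothetical, and the weights are arbitrary toy values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Compute W v + b for a list-of-rows matrix W."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + b_i
            for row, b_i in zip(W, b)]

def gru_step(x, h_prev, params):
    """One GRU step with update gate u and reset gate r."""
    z = h_prev + x  # list concatenation [h_prev, x]
    u = [sigmoid(a) for a in affine(params["W_u"], z, params["b_u"])]  # update gate
    r = [sigmoid(a) for a in affine(params["W_r"], z, params["b_r"])]  # reset gate
    # The reset gate scales how much of the old state feeds the candidate.
    gated = [r_k * h_k for r_k, h_k in zip(r, h_prev)] + x
    cand = [math.tanh(a) for a in affine(params["W_c"], gated, params["b_c"])]
    # The update gate interpolates between the old state and the candidate.
    return [(1.0 - u_k) * h_k + u_k * c_k
            for u_k, h_k, c_k in zip(u, h_prev, cand)]

# Toy setup (assumed sizes): 1-dimensional input, 2 hidden units.
params = {
    "W_u": [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]], "b_u": [0.0, 0.0],
    "W_r": [[-0.3, 0.2, 0.1], [0.2, 0.1, 0.5]], "b_r": [0.0, 0.0],
    "W_c": [[0.4, 0.1, -0.2], [-0.1, 0.3, 0.6]], "b_c": [0.0, 0.0],
}

h = [0.0, 0.0]
for x in ([1.0], [0.5], [-1.0]):
    h = gru_step(x, h, params)
print(h)
```

There is a single state vector and three weight matrices here, against two state vectors and four matrices in an LSTM of the same size.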
Both handle long-term dependencies better than a basic RNN, because their gated state updates give gradients a more direct path through time.
Applications
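- Text classification: an embedding layer followed by an LSTM, whose final hidden state feeds a classifier.
- Sequence-to-sequence tasks such as translation: an encoder-decoder pair of RNNs.
- Time series forecasting: predicting future values from a window of past observations.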
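For time series forecasting, a recurrent model is usually trained on pairs of a past window and a future target. The sketch below shows one simple way to build such pairs. The function name `make_windows`, its parameters `window` and `horizon`, and the sample series are hypothetical.

```python
def make_windows(series, window, horizon=1):
    """Split a series into (past window, future target) training pairs."""
    pairs = []
    last_start = len(series) - window - horizon + 1
    for start in range(last_start):
        past = series[start:start + window]
        # The target lies `horizon` steps after the end of the window.
        target = series[start + window + horizon - 1]
        pairs.append((past, target))
    return pairs

series = [10, 12, 13, 15, 18, 21]
pairs = make_windows(series, window=3)
for past, target in pairs:
    print(past, "->", target)
```

Each window would then be fed to the RNN step by step, with the final hidden state used to predict the target.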
Key Takeaways
- An RNN maintains a hidden state across the time steps of a sequence
- LSTM/GRU handle long-term dependencies
- Useful for text and time series