📋 Main Topics¶
- Intro to Sequential Modeling
- Recurrent Neural Networks (RNNs)
- Long Short-Term Memory (LSTM)
- Gated Recurrent Units (GRU)
- Attention Mechanisms
- Transformers
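To preview the core idea behind the recurrent models listed above, here is a minimal sketch of a vanilla RNN hidden-state update, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b), in plain Python. The function name `rnn_step` and the toy weights are illustrative, not from any particular library:

```python
import math

def rnn_step(x, h, W_xh, W_hh, b):
    """One vanilla RNN update: h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1} + b)."""
    n = len(h)
    return [
        math.tanh(
            sum(W_xh[i][j] * x[j] for j in range(len(x)))
            + sum(W_hh[i][j] * h[j] for j in range(n))
            + b[i]
        )
        for i in range(n)
    ]

# Toy 2-d input / 2-d hidden state with hand-picked weights.
W_xh = [[0.5, 0.0], [0.0, 0.5]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

h = [0.0, 0.0]
for x in [[1.0, 0.0], [0.0, 1.0]]:  # a length-2 input sequence
    h = rnn_step(x, h, W_xh, W_hh, b)  # hidden state carries info forward
print(h)
```

Note that the same weights are reused at every time step; LSTMs and GRUs keep this structure but add gates that control how much of `h` is overwritten, which is what mitigates the vanishing-gradient problem covered in the readings.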
🧠 Class Activity - Labs¶
- Lab 3: Sequential Modelling
📚 Recommended Readings¶
- Recurrent Neural Networks (Dive into Deep Learning, Chapter 9)
- Long Short-Term Memory (LSTM) (Dive into Deep Learning, Section 10.1)
- Gated Recurrent Units (GRU) (Dive into Deep Learning, Section 10.2)
- Attention Mechanisms and Transformers (Dive into Deep Learning, Chapter 11)
🐼 Beginner-Friendly¶
- Encoder-Decoder Architecture (By Google)
- Transformer Models and BERT Model (By Google)
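The transformer readings above all build on scaled dot-product attention: each query is compared against every key, the similarities are normalized with softmax, and the output is the resulting weighted average of the values. A minimal single-query sketch in plain Python (the function names and toy vectors are illustrative):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query:
    weights = softmax(q . k_i / sqrt(d)); output = sum_i weights_i * v_i."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]

# The query aligns with the first key, so the output leans toward the first value.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
out = attention([1.0, 0.0], keys, values)
print(out)
```

In a real transformer this runs for all queries at once as matrix multiplications, with learned projections producing the queries, keys, and values, and multiple heads attending in parallel.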
📖 Optional (Advanced) Reading¶
- Understanding LSTM Networks (Christopher Olah - The definitive guide to LSTM internals)
- The Unreasonable Effectiveness of Recurrent Neural Networks (Andrej Karpathy)
- The Annotated Transformer (Harvard NLP - Line-by-line PyTorch implementation of the Transformer paper)
- Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville:
- Chapter 10: Sequence Modeling: Recurrent and Recursive Nets
- 10.2 Recurrent Neural Networks (Pages 372-388)
- 10.7 The Challenge of Long-Term Dependencies (Pages 396-399)
- Supervised Sequence Labelling with Recurrent Neural Networks by Alex Graves:
- Chapter 4: Long Short-Term Memory (Pages 31-38)
- Visualizing RNNs
- The Illustrated Transformer