📋 Main Topics¶
- Intro to Sequential Modeling
- Recurrent Neural Networks (RNNs)
- Long Short-Term Memory (LSTM)
- Gated Recurrent Units (GRU)
- Attention Mechanisms
- Transformers
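To preview the core idea behind the recurrent models listed above, here is a minimal sketch of a vanilla RNN hidden-state update, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b), in plain Python. The function name `rnn_step` and the toy weights are illustrative, not from any particular library:

```python
import math

def rnn_step(x, h, W_xh, W_hh, b):
    """One vanilla RNN update: h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1} + b)."""
    n = len(h)
    return [
        math.tanh(
            sum(W_xh[i][j] * x[j] for j in range(len(x)))
            + sum(W_hh[i][j] * h[j] for j in range(n))
            + b[i]
        )
        for i in range(n)
    ]

# Toy 2-d input / 2-d hidden state with hand-picked weights.
W_xh = [[0.5, 0.0], [0.0, 0.5]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

h = [0.0, 0.0]
for x in [[1.0, 0.0], [0.0, 1.0]]:  # a length-2 input sequence
    h = rnn_step(x, h, W_xh, W_hh, b)  # hidden state carries info forward
print(h)
```

Note that the same weights are reused at every time step; LSTMs and GRUs keep this structure but add gates that control how much of `h` is overwritten, which is what mitigates the vanishing-gradient problem covered in the readings.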
🧠 Class Activity - Labs¶
- Lab 3: Sequential Modelling
📚 Recommended Readings¶
- Recurrent Neural Networks (Dive into Deep Learning, Chapter 9)
- Long Short-Term Memory (LSTM) (Dive into Deep Learning, Section 10.1)
- Gated Recurrent Units (GRU) (Dive into Deep Learning, Section 10.2)
- Attention Mechanisms and Transformers (Dive into Deep Learning, Chapter 11)
🐼 Beginner-Friendly¶
- Encoder-Decoder Architecture (By Google)
- Transformer Models and BERT Model (By Google)
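The transformer readings above all build on scaled dot-product attention: each query is compared against every key, the similarities are normalized with softmax, and the output is the resulting weighted average of the values. A minimal single-query sketch in plain Python (the function names and toy vectors are illustrative):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query:
    weights = softmax(q . k_i / sqrt(d)); output = sum_i weights_i * v_i."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]

# The query aligns with the first key, so the output leans toward the first value.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
out = attention([1.0, 0.0], keys, values)
print(out)
```

In a real transformer this runs for all queries at once as matrix multiplications, with learned projections producing the queries, keys, and values, and multiple heads attending in parallel.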
📖 Optional (Advanced) Reading¶
- Understanding LSTM Networks (Christopher Olah - The definitive guide to LSTM internals)
- The Unreasonable Effectiveness of Recurrent Neural Networks (Andrej Karpathy)
- The Annotated Transformer (Harvard NLP - Line-by-line PyTorch implementation of the Transformer paper)
- Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville:
- Chapter 10: Sequence Modeling: Recurrent and Recursive Nets
- 10.2 Recurrent Neural Networks (Pages 372-388)
- 10.7 The Challenge of Long-Term Dependencies (Pages 396-399)
- Supervised Sequence Labelling with Recurrent Neural Networks by Alex Graves:
- Chapter 4: Long Short-Term Memory (Pages 31-38)
- Visualizing RNNs
- The Illustrated Transformer