Reinforcement Learning: An Interactive Guide

Master RL concepts through visualizations, interactive examples, and hands-on simulations. Based on Sutton and Barto's classic textbook, Reinforcement Learning: An Introduction.

Interactive Visualizations

See algorithms in action with animated diagrams and real-time simulations.

Hands-on Examples

Play games, adjust parameters, and experiment with different strategies.

LaTeX Equations

All mathematical formulas rendered beautifully with clear explanations.

Chapters

Chapter 1: Introduction (The Reinforcement Learning Problem)

Learn the fundamentals of RL: what it is, how it differs from other ML paradigms, and the key elements that make up any RL system.
Chapter 2: Multi-armed Bandits (Evaluative Feedback & Action Selection)

Explore the simplest form of RL: the k-armed bandit problem. Learn action-value methods, exploration strategies, and gradient-based approaches.
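As a taste of the action-value methods this chapter covers, here is a minimal ε-greedy bandit loop with incremental sample-average updates. This is a sketch, not the guide's own code; the arm means, step count, and ε below are made-up illustration values.

```python
import random

def run_bandit(true_means, steps=1000, eps=0.1, seed=0):
    """Sample-average epsilon-greedy on a k-armed Gaussian bandit.

    `true_means`, `steps`, and `eps` are illustrative choices.
    """
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k          # action-value estimates Q(a)
    n = [0] * k            # pull counts N(a)
    for _ in range(steps):
        if rng.random() < eps:                    # explore: random arm
            a = rng.randrange(k)
        else:                                     # exploit: greedy arm
            a = max(range(k), key=lambda i: q[i])
        r = rng.gauss(true_means[a], 1.0)         # noisy reward
        n[a] += 1
        q[a] += (r - q[a]) / n[a]                 # incremental sample average
    return q, n

q, n = run_bandit([0.1, 0.8, 0.3])
```

With enough steps, the estimates in `q` approach the true arm means, and most pulls concentrate on the best arm.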
Chapter 3: Finite Markov Decision Processes (The MDP Framework)

Formalize the RL problem using MDPs. Learn about states, actions, rewards, policies, and value functions.
Chapter 4: Dynamic Programming (Planning with a Known Model)

Solve MDPs when the environment model is known. Learn policy evaluation, policy improvement, and value iteration.
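To preview value iteration, here is a compact sketch that sweeps Bellman optimality backups until the value function stops changing. The two-state MDP at the bottom is a hypothetical example invented for illustration, not one from the chapter.

```python
def value_iteration(states, actions, P, R, gamma=0.9, theta=1e-8):
    """Value iteration on a small known MDP.

    P[s][a] -> list of (prob, next_state); R[s][a] -> expected reward.
    gamma and theta are illustrative choices.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality backup: best one-step lookahead value
            v = max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                    for a in actions)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            return V

# Hypothetical 2-state MDP: in either state you may 'stay' or 'move'.
states, actions = ["A", "B"], ["stay", "move"]
P = {"A": {"stay": [(1.0, "A")], "move": [(1.0, "B")]},
     "B": {"stay": [(1.0, "B")], "move": [(1.0, "A")]}}
R = {"A": {"stay": 0.0, "move": 1.0},
     "B": {"stay": 2.0, "move": 0.0}}
V = value_iteration(states, actions, P, R)
```

For this toy MDP the fixed point is easy to verify by hand: staying in B forever earns 2 per step, so V(B) = 2/(1-0.9) = 20, and moving from A gives V(A) = 1 + 0.9·20 = 19.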
Chapter 5: Monte Carlo Methods (Learning from Complete Episodes)

Learn from experience without a model. Monte Carlo methods average sample returns to estimate value functions.
Chapter 6: Temporal-Difference Learning (Bootstrapping Without a Model)

Combine the strengths of Monte Carlo and dynamic programming: TD methods learn from incomplete episodes by bootstrapping from current estimates.
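To see bootstrapping in miniature, here is TD(0) prediction on the classic five-state random walk (states A through E, start in the middle, reward +1 only on the right exit). The step size and episode count are illustrative choices, not the chapter's.

```python
import random

def td0_random_walk(episodes=200, alpha=0.1, seed=1):
    """TD(0) prediction on the 5-state random walk (a sketch)."""
    rng = random.Random(seed)
    V = [0.5] * 5           # value estimates for states A..E
    for _ in range(episodes):
        s = 2               # start in the middle (state C)
        while True:
            s2 = s + (1 if rng.random() < 0.5 else -1)
            if s2 == -1:            # left terminal, reward 0
                V[s] += alpha * (0.0 - V[s])
                break
            if s2 == 5:             # right terminal, reward +1
                V[s] += alpha * (1.0 - V[s])
                break
            # TD(0): bootstrap from the current estimate of the next state
            V[s] += alpha * (0.0 + V[s2] - V[s])
            s = s2
    return V

V = td0_random_walk()
```

The true values are 1/6, 2/6, ..., 5/6; the learned estimates drift toward them without ever waiting for a full episode's return.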
Chapter 7: n-step Bootstrapping (Unifying TD and Monte Carlo)

Bridge TD(0) and MC with n-step methods. Control the bias-variance tradeoff by choosing how many steps to look ahead.
Chapter 8: Planning and Learning with Tabular Methods (Integrating Model-Based and Model-Free RL)

Learn how to combine model-based planning with model-free learning. Explore Dyna, prioritized sweeping, MCTS, and more.
Chapter 9: On-policy Prediction with Approximation (Function Approximation for Large State Spaces)

Scale RL to large state spaces using function approximation. Learn SGD methods, linear approximation, feature construction, and neural networks.
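The core move in this chapter is replacing a value table with a parameterized function. A one-step sketch of semi-gradient TD(0) with a linear approximator v̂(s, w) = w·φ(s) might look like this; the feature map, step size, and discount below are placeholders for illustration.

```python
def semi_gradient_td0(phi, w, s, r, s2, alpha=0.01, gamma=0.9,
                      terminal=False):
    """One semi-gradient TD(0) update for linear v_hat(s, w) = w . phi(s).

    A sketch: phi is any feature function, alpha/gamma are illustrative.
    """
    x = phi(s)
    v = sum(wi * xi for wi, xi in zip(w, x))
    v2 = 0.0 if terminal else sum(wi * xi for wi, xi in zip(w, phi(s2)))
    delta = r + gamma * v2 - v          # TD error
    # Semi-gradient: only v_hat(s) is differentiated, not the bootstrap target
    return [wi + alpha * delta * xi for wi, xi in zip(w, x)]
```

The "semi" in semi-gradient is visible in the last line: the update follows the gradient of v̂(s, w) alone, treating the bootstrapped target r + γ·v̂(s', w) as a constant.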
Chapter 10: On-policy Control with Approximation (Learning Optimal Policies with Function Approximation)

Extend function approximation to control. Learn semi-gradient Sarsa, the average reward setting, and differential value functions.
Chapter 11: Off-policy Methods with Approximation (Learning from Different Policies)

Explore the challenges of off-policy learning with function approximation. Understand the deadly triad, Bellman error, and stable algorithms like Gradient-TD.
12

Eligibility Traces

Available

Unifying TD and Monte Carlo Methods

Discover eligibility traces, a mechanism that bridges TD and Monte Carlo methods. Learn about λ-returns, TD(λ), True Online TD(λ), and how traces enable efficient credit assignment.

Start Chapter
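The mechanics of a trace are small enough to sketch in one function: each TD error is broadcast backward to every recently visited state, weighted by a decaying trace. This is a tabular accumulating-traces sketch with made-up α, γ, and λ defaults, not code from the chapter.

```python
def td_lambda_update(V, traces, s, r, s2, alpha=0.1, gamma=1.0,
                     lam=0.9, terminal=False):
    """One tabular TD(lambda) step with accumulating traces (a sketch).

    V and traces are dicts keyed by state; alpha/gamma/lam are illustrative.
    """
    delta = r + (0.0 if terminal else gamma * V.get(s2, 0.0)) - V[s]
    traces[s] = traces.get(s, 0.0) + 1.0      # accumulate trace for s
    for x in list(traces):
        # credit every traced state in proportion to its eligibility
        V[x] = V.get(x, 0.0) + alpha * delta * traces[x]
        traces[x] *= gamma * lam              # decay all traces
    return V, traces
```

With λ = 0 the traces vanish immediately and this reduces to TD(0); with λ = 1 and no discounting it approaches a Monte Carlo update, which is exactly the bridge this chapter formalizes via the λ-return.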
Chapter 13: Policy Gradient Methods (Learning Parameterized Policies Directly)

Move beyond action-value methods to learn policies directly. Explore the policy gradient theorem, REINFORCE, actor-critic methods, and continuous action spaces.
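To make "learning a policy directly" concrete, here is REINFORCE on a one-step, two-action problem with a softmax policy over preferences. The reward scheme, step size, and horizon are invented for illustration; the gradient of log π under a softmax is 1[a=b] - π(b), which is the only nontrivial line.

```python
import math, random

def reinforce_bandit(steps=2000, alpha=0.1, seed=0):
    """REINFORCE with a softmax policy on a 2-action, one-step task.

    A sketch: action 1 pays +1, action 0 pays 0, so the policy
    should shift its probability mass toward action 1.
    """
    rng = random.Random(seed)
    h = [0.0, 0.0]                                # action preferences
    for _ in range(steps):
        z = [math.exp(x) for x in h]
        pi = [x / sum(z) for x in z]              # softmax policy
        a = 0 if rng.random() < pi[0] else 1      # sample an action
        r = 1.0 if a == 1 else 0.0                # deterministic reward
        # REINFORCE update: h += alpha * r * grad log pi(a)
        for b in range(2):
            h[b] += alpha * r * ((1.0 if b == a else 0.0) - pi[b])
    z = [math.exp(x) for x in h]
    return [x / sum(z) for x in z]

pi = reinforce_bandit()
```

No action values are ever estimated; the preferences themselves are pushed in the direction that makes rewarded actions more probable, which is the essence of the policy gradient approach.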
Chapter 14: Psychology (RL and Animal Learning)

Explore the deep connections between reinforcement learning and psychology. Understand how RL algorithms correspond to classical and instrumental conditioning, the TD model, cognitive maps, and habitual vs. goal-directed behavior.
Chapter 15: Neuroscience (RL and the Brain)

Discover the remarkable connections between reinforcement learning and neuroscience. Learn how dopamine neurons encode TD errors, the neural actor-critic architecture, and how RL provides a computational framework for understanding the brain's reward system.
Chapter 16: Applications and Case Studies (RL Success Stories)

Explore landmark applications of reinforcement learning: from TD-Gammon and Samuel's checkers player to DeepMind's Atari-playing DQN and AlphaGo. See how RL has achieved superhuman performance in games and real-world systems.
Chapter 17: Frontiers (The Future of RL)

Explore the cutting edge of reinforcement learning: general value functions, temporal abstraction via options, partial observability, reward design challenges, and the role of RL in the future of artificial intelligence.