About

microgpt-ts is a complete GPT implementation, built from scratch in TypeScript with zero runtime dependencies and inspired by Andrej Karpathy's microgpt. It covers the whole pipeline in ~400 lines of readable code: a tokenizer, an autograd engine, a GPT-2-like architecture with multi-head attention, an Adam optimizer, and training and inference loops.

This is an educational project. The full source code is on GitHub, each implementation step is a separate pull request you can read through, and the playground lets you train and run the model directly in your browser.

What's inside

  • The microgpt-ts library: a Value autograd engine (sketched after this list), the GPT-2 architecture (embeddings, multi-head attention, MLP, residual connections, rmsnorm), and an Adam optimizer
  • A browser playground where you can train the model and generate text with no install and no backend
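
To give a feel for the first of those pieces: an autograd Value wraps a number, records the operations that produced it, and replays the chain rule backwards through that record. The sketch below shows the core idea only; the class shape and method names here are illustrative assumptions, not microgpt-ts's actual API.

    // Sketch of a scalar autograd value. Names (Value, add, mul, backward)
    // are illustrative, not necessarily microgpt-ts's real API.
    class Value {
      grad = 0;
      constructor(
        public data: number,
        private children: Value[] = [],
        // Propagates this node's gradient to its children via the chain rule.
        private backwardFn: () => void = () => {},
      ) {}

      add(other: Value): Value {
        const out = new Value(this.data + other.data, [this, other]);
        out.backwardFn = () => {
          this.grad += out.grad;  // d(a+b)/da = 1
          other.grad += out.grad; // d(a+b)/db = 1
        };
        return out;
      }

      mul(other: Value): Value {
        const out = new Value(this.data * other.data, [this, other]);
        out.backwardFn = () => {
          this.grad += other.data * out.grad; // d(a*b)/da = b
          other.grad += this.data * out.grad; // d(a*b)/db = a
        };
        return out;
      }

      // Topologically sort the graph, then apply the chain rule from the
      // output node back to every input.
      backward(): void {
        const topo: Value[] = [];
        const visited = new Set<Value>();
        const build = (v: Value) => {
          if (visited.has(v)) return;
          visited.add(v);
          v.children.forEach(build);
          topo.push(v);
        };
        build(this);
        this.grad = 1;
        for (let i = topo.length - 1; i >= 0; i--) topo[i].backwardFn();
      }
    }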

Learn step by step

Following Karpathy's blog post, the model is built up one concept at a time. Each step introduces a new idea and is its own pull request, so you can follow the progression from a lookup table to a full GPT (the attention and Adam steps are sketched below the list):

  1. Bigram count table — no neural net, no gradients
  2. MLP + manual gradients + SGD
  3. Autograd — a Value class that replaces manual gradients
  4. Single-head attention — position embeddings, rmsnorm, residual connections
  5. Multi-head attention + layer loop — full GPT architecture
  6. Adam optimizer
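
As a taste of step 4, a single attention head scores each position against every earlier position, softmaxes the scores, and takes a weighted sum of the value vectors. This sketch works on plain number arrays for brevity; the real implementation presumably runs the same operations through Value objects so gradients can flow.

    // Illustrative causal single-head attention over plain numbers.
    // scores[t][s] = (q_t . k_s) / sqrt(headDim), masked so position t
    // only attends to positions s <= t, then softmax-weighted over v.
    function attentionHead(q: number[][], k: number[][], v: number[][]): number[][] {
      const T = q.length;
      const headDim = q[0].length;
      const out: number[][] = [];
      for (let t = 0; t < T; t++) {
        // Causal mask: only score positions 0..t.
        const scores: number[] = [];
        for (let s = 0; s <= t; s++) {
          let dot = 0;
          for (let d = 0; d < headDim; d++) dot += q[t][d] * k[s][d];
          scores.push(dot / Math.sqrt(headDim));
        }
        // Softmax over the visible scores (shifted by the max for stability).
        const maxScore = Math.max(...scores);
        const exps = scores.map((x) => Math.exp(x - maxScore));
        const sum = exps.reduce((a, b) => a + b, 0);
        const weights = exps.map((e) => e / sum);
        // Weighted sum of value vectors.
        const row: number[] = new Array(headDim).fill(0);
        for (let s = 0; s <= t; s++) {
          for (let d = 0; d < headDim; d++) row[d] += weights[s] * v[s][d];
        }
        out.push(row);
      }
      return out;
    }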
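
And step 6 replaces SGD's single global learning rate with Adam's per-parameter moment estimates. Here is the standard update rule sketched over flat arrays; the hyperparameter defaults follow the original Adam paper and are not necessarily what the library uses.

    // Standard Adam update over flat arrays; m and v are running estimates
    // of the gradient's first and second moments, bias-corrected by the
    // 1-based step count t.
    function adamStep(
      params: number[], grads: number[],
      m: number[], v: number[], t: number,
      lr = 1e-3, beta1 = 0.9, beta2 = 0.999, eps = 1e-8,
    ): void {
      for (let i = 0; i < params.length; i++) {
        m[i] = beta1 * m[i] + (1 - beta1) * grads[i];
        v[i] = beta2 * v[i] + (1 - beta2) * grads[i] * grads[i];
        const mHat = m[i] / (1 - Math.pow(beta1, t)); // bias correction
        const vHat = v[i] / (1 - Math.pow(beta2, t));
        params[i] -= (lr * mHat) / (Math.sqrt(vHat) + eps);
      }
    }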

Differences from the original

Karpathy's original microgpt is a single Python script optimized for brevity. microgpt-ts prioritizes readability instead: the code is split across files, everything is typed, and math operations are broken out into helper functions like dotProduct, transpose, and mean.
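
For a sense of what that looks like, here are plausible shapes for those helpers. Only the names dotProduct, transpose, and mean come from the project; these signatures over raw numbers are an assumption, and the real versions likely operate on Value objects.

    // Hypothetical signatures for the named helpers.
    function dotProduct(a: number[], b: number[]): number {
      let sum = 0;
      for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
      return sum;
    }

    function transpose(m: number[][]): number[][] {
      // Swap rows and columns (assumes a non-empty rectangular matrix).
      return m[0].map((_, col) => m.map((row) => row[col]));
    }

    function mean(xs: number[]): number {
      return xs.reduce((a, b) => a + b, 0) / xs.length;
    }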

The result is a reusable library packaged as a module, not a standalone script. The playground imports it directly. And because it's TypeScript, it runs natively in the browser with no Python runtime or backend required.

Credits

Inspired by Andrej Karpathy's microgpt.

Built by @dubzdubz. Source on GitHub.