Representations
This part turns raw text into vectors the model can compute with. The goal is for each row of the model’s input matrix to encode both which token it represents and where that token sits in the sequence.
- In 3 Tokens — Text to Numbers, you will see how text becomes token IDs, and how a small byte-pair encoding tokenizer works.
- In 4 Embeddings — Numbers to Meaning, token IDs become learned vectors through an embedding matrix.
- In 5 Positional Encoding — Giving Order to Meaning, those vectors gain order so the model can distinguish “dog bites man” from “man bites dog.”
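
To make the three steps concrete before the chapters go into detail, here is a minimal sketch of the whole pipeline in plain Python. It rests on several assumptions that are not taken from the chapters themselves: the three-word vocabulary, the whitespace split (a stand-in for the byte-pair encoding tokenizer), the randomly initialized embedding matrix (learned in a real model), and the sinusoidal position scheme are all illustrative choices, and the names `VOCAB`, `EMBEDDING`, `tokenize`, `embed`, `positional_encoding`, and `encode` are hypothetical.

```python
import math
import random

# Hypothetical toy vocabulary: whole words stand in for subword tokens.
VOCAB = {"dog": 0, "bites": 1, "man": 2}
D_MODEL = 4  # illustrative embedding width

random.seed(0)
# Embedding matrix: one row per token ID (random here; learned in a real model).
EMBEDDING = [[random.uniform(-1.0, 1.0) for _ in range(D_MODEL)] for _ in VOCAB]


def tokenize(text):
    """Map whitespace-separated words to token IDs."""
    return [VOCAB[word] for word in text.split()]


def embed(token_ids):
    """Look up one embedding row per token ID."""
    return [list(EMBEDDING[t]) for t in token_ids]


def positional_encoding(position, d_model=D_MODEL):
    """Sinusoidal encoding (an assumed scheme): sine on even dims, cosine on odd."""
    row = []
    for i in range(d_model):
        angle = position / (10000 ** ((i - i % 2) / d_model))
        row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return row


def encode(text):
    """Build the input matrix: embedding row plus positional row, per token."""
    rows = embed(tokenize(text))
    return [
        [e + p for e, p in zip(row, positional_encoding(pos))]
        for pos, row in enumerate(rows)
    ]


a, b = "dog bites man", "man bites dog"
# Without positions, compare the two sentences as unordered collections of rows.
print(sorted(embed(tokenize(a))) == sorted(embed(tokenize(b))))
# With positions added, compare the row for "dog" at position 0 and at position 2.
print(encode(a)[0] == encode(b)[2])
```

The first comparison treats each sentence as an unordered collection of embedding rows, which is all the model would have to work with if position were left out. The second compares the row for "dog" after position has been added, once at the start of a sentence and once at the end.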