Transformer Core

This part builds the transformer block from its major components. The chapters move from one attention head to a full block that can be stacked into a GPT model.