Prediction and Learning

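This part follows the model from its final hidden states to its training updates. It explains how GPT turns those hidden states into a probability distribution over the next token, measures how wrong its predictions are with a loss, and adjusts its weights to reduce that loss.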