At Scale

| Model | Vocab Size | Embed Dim | Embedding Parameters |
|---|---|---|---|
| Our Calculator | 36 | 64 | 2,304 |
| GPT-2 | 50,257 | 768 | 38.6 million |
| GPT-3 | 50,257 | 12,288 | 617 million |
| GPT-4 | ~100,000 | ~12,288 | ~1.2 billion |
GPT-4's embedding layer alone (~1.2 billion parameters) is roughly 500,000× the size of our entire embedding layer. Same concept, vastly different scale.
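
Every number in the table comes from the same multiplication: vocabulary size times embedding dimension, because the embedding layer stores one learned vector of length `embed_dim` for every token in the vocabulary. Here is a minimal sketch in Python that reproduces the table and the 500,000× ratio (the GPT-4 figures are public estimates, not confirmed values):

```python
# Embedding parameter count = vocab_size * embed_dim:
# one learned vector of length embed_dim per vocabulary token.
models = {
    "Our Calculator": (36, 64),
    "GPT-2": (50_257, 768),
    "GPT-3": (50_257, 12_288),
    "GPT-4 (estimated)": (100_000, 12_288),  # estimates, not official figures
}

params = {name: vocab * dim for name, (vocab, dim) in models.items()}
for name, count in params.items():
    print(f"{name:>18}: {count:>13,} embedding parameters")

# The ratio behind the "roughly 500,000x" claim: ~1.2B / 2,304
ratio = params["GPT-4 (estimated)"] / params["Our Calculator"]
print(f"GPT-4 vs. ours: {ratio:,.0f}x")  # -> 533,333x
```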