Build Your First LLM from Scratch
Part 3 · Section 13 of 13

Summary

Here's what we built and how it compares to GPT-4:

| Component           | Our Model | GPT-4    | Ratio     |
|---------------------|-----------|----------|-----------|
| Vocabulary size     | 36        | ~100,000 | ~2,800×   |
| Embedding dim       | 64        | 12,288   | 192×      |
| Embedding params    | 2,304     | ~1.2B    | ~520,000× |
| Max sequence length | 32        | 128,000  | 4,000×    |
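The embedding-parameter row follows directly from the first two: an embedding table holds one vector per vocabulary entry, so its parameter count is vocabulary size × embedding dimension. A quick check (the GPT-4 figures are rough public estimates, not official numbers):

```python
# Embedding table parameters = vocabulary size * embedding dimension.
our_params = 36 * 64                # our toy model
gpt4_params = 100_000 * 12_288     # rough estimate for GPT-4

print(our_params)                  # 2304
print(gpt4_params)                 # 1228800000, i.e. ~1.2B
print(gpt4_params // our_params)   # ratio on the order of 500,000x
```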

What You Can Now Do

  • Build a vocabulary for any task
  • Convert text to token IDs and back
  • Convert token IDs to embeddings
  • Add position information to embeddings
  • Understand how this scales to real models
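The steps above fit in a few lines. Here is a minimal end-to-end sketch using hypothetical names (a character-level vocabulary and random embedding tables, matching this series' toy dimensions of 64 and 32):

```python
import numpy as np

text = "hello world"

# Build a vocabulary for the task (character-level here).
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
itos = {i: ch for ch, i in stoi.items()}

# Convert text to token IDs and back.
ids = [stoi[ch] for ch in text]
assert "".join(itos[i] for i in ids) == text

# Convert token IDs to embeddings (random tables stand in for learned ones).
rng = np.random.default_rng(0)
embed_dim, max_seq = 64, 32
tok_emb = rng.normal(size=(len(vocab), embed_dim))  # token embedding table
pos_emb = rng.normal(size=(max_seq, embed_dim))     # position embedding table

# Add position information to the token embeddings.
x = tok_emb[ids] + pos_emb[: len(ids)]
print(x.shape)  # (11, 64): one 64-dim vector per token
```

Real models run the same pipeline, just with learned subword vocabularies and the much larger tables in the comparison above.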