AI Fundamentals · Beginner

LLM Tokenization Explained

Tokens are the atomic units LLMs process — not words, but subword pieces. Understanding tokens helps you write better prompts and manage API costs.

What is a Token?

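A token is a chunk of text, averaging roughly 3-4 characters in English. Tokenizers split text into pieces drawn from a fixed vocabulary, so a word can map to one token or several. For example, "ChatGPT" may be split into 3 tokens ("Chat", "G", "PT"), though the exact split depends on the tokenizer. Common words like "the" are 1 token; rare words may be split into many. Most LLM APIs charge per token, so token counts translate directly into cost.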
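
To make the splitting concrete, here is a minimal Python sketch of a toy subword tokenizer. It is an illustration under stated assumptions, not how any production tokenizer works: the vocabulary `TOY_VOCAB` is hand-written and hypothetical, and the greedy longest-match rule stands in for the learned merge rules that BPE tokenizers actually apply. The `estimate_cost` helper and its price argument are likewise hypothetical; check your provider's pricing for real numbers.

```python
# Hypothetical toy vocabulary. Real tokenizers learn tens of thousands
# of pieces from data; this one is hand-picked for illustration only.
TOY_VOCAB = {"Chat", "G", "PT", "the", "token", "s", " "}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text by greedy longest match against the vocabulary.

    Characters that match no vocabulary piece become single-character
    tokens, which mimics how rare words break into many small pieces.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until one matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing in the vocabulary matched: emit one character.
            tokens.append(text[i])
            i += 1
    return tokens


def estimate_cost(n_tokens, price_per_1k_tokens):
    """Estimate cost for n_tokens at an assumed price per 1,000 tokens."""
    return n_tokens / 1000 * price_per_1k_tokens


print(toy_tokenize("ChatGPT"))     # ['Chat', 'G', 'PT']
print(toy_tokenize("the tokens"))  # ['the', ' ', 'token', 's']
print(toy_tokenize("xyz"))         # ['x', 'y', 'z']
```

A common word in the vocabulary ("the") stays whole, while a string the vocabulary has never seen ("xyz") falls apart into one token per character. That is why unusual words tend to cost more tokens than everyday ones.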

Tags: token, tokenizer, BPE