What is a token?
Tokens are the basic units that AI language models use to process text. Knowing how tokens work helps you understand how models read your text and how usage is priced.
What Are Tokens?
Tokens are the pieces that a model's tokenizer splits text into. A token can be:
- ▸A whole word: "hello"
- ▸Part of a word: "ing" from "running"
- ▸A punctuation mark: "."
- ▸A space or special character
[Rough estimate]: 1 token ≈ 4 characters or 0.75 words in English
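The rule of thumb above can be turned into a quick estimator. This is only a sketch: the function name `estimate_tokens` is made up for illustration, and averaging the character-based and word-based estimates is an assumption of this example, not how any real tokenizer works. For exact counts, use the tokenizer that belongs to your model.

```python
import math

CHARS_PER_TOKEN = 4      # rule of thumb from the text (English only)
WORDS_PER_TOKEN = 0.75   # rule of thumb from the text (English only)


def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text (hypothetical helper)."""
    by_chars = len(text) / CHARS_PER_TOKEN
    by_words = len(text.split()) / WORDS_PER_TOKEN
    # Assumption: average the two heuristics and round up.
    return math.ceil((by_chars + by_words) / 2)


print(estimate_tokens("Hello, how are you?"))  # prints 6
```

Treat the result as a ballpark figure. Code, non-English text, and unusual words often use more tokens than this estimate suggests.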
Why Tokens Matter
[Pricing]: Most AI providers bill per token, so token counts translate directly into cost.
[Limits]: Models have token limits (context windows). You need to stay within these limits.
[Efficiency]: Shorter prompts use fewer tokens, which costs less and processes faster.
Token Limits
Different models have different token limits, and the numbers change between model versions:
- ▸[GPT-4 Turbo]: Up to 128,000 tokens
- ▸[Claude 3]: Up to 200,000 tokens
- ▸[GPT-3.5 Turbo]: Up to 16,000 tokens
These limits cover input (your prompt) and output (the model's response) combined. Many models also cap the output length separately, at a lower number.
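Because input and output share the same window, a simple budget check can catch oversized requests before you send them. The sketch below assumes you already know (or have estimated) your token counts. The function names are hypothetical, and the 128,000 figure is just the example limit from the list above.

```python
def fits_in_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """True if the prompt plus the reserved output fits in the window."""
    return input_tokens + max_output_tokens <= context_window


def remaining_for_output(input_tokens: int, context_window: int) -> int:
    """Tokens left for the response after the prompt (never negative)."""
    return max(context_window - input_tokens, 0)


CONTEXT_WINDOW = 128_000  # example limit taken from the text

print(fits_in_context(120_000, 4_000, CONTEXT_WINDOW))   # prints True
print(fits_in_context(126_000, 4_000, CONTEXT_WINDOW))   # prints False
print(remaining_for_output(126_000, CONTEXT_WINDOW))     # prints 2000
```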
Counting Tokens
[Input tokens]: The text you send to the model (your prompt)
[Output tokens]: The text the model generates (its response)
[Total tokens]: Input + output = what you're charged for. Many providers price input and output tokens at different rates.
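To see how token counts turn into a bill, here is a minimal cost calculation. It assumes prices are quoted per million tokens and that input and output have separate rates, which is common but not universal. The prices in the example are invented for illustration, so check your provider's price list for real numbers.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Cost of one request, in the same currency as the prices (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Hypothetical prices: $1.00 per million input tokens, $2.00 per million output tokens.
cost = estimate_cost(1_500, 500, 1.00, 2.00)
print(f"${cost:.4f}")  # prints $0.0025
```

Multiply the per-request cost by your expected number of requests to get a rough monthly budget.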
Examples
- ▸"Hello, how are you?" ≈ 5-6 tokens
- ▸A typical email ≈ 50-100 tokens
- ▸A page of text ≈ 250-500 tokens
- ▸A short article ≈ 1,000-2,000 tokens
Best Practices
[Be concise]: Shorter prompts use fewer tokens and cost less
[Monitor usage]: Track token usage to understand costs
[Stay within limits]: Keep prompts and expected outputs within model limits
[Use efficiently]: Remove unnecessary text from prompts
Tools
Many tools can help you count tokens:
- ▸OpenAI's tokenizer
- ▸Online token counters
- ▸API responses often include token counts
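When an API response includes token counts, you can read them straight from the response data instead of estimating. The sketch below works on a plain dictionary. The sample is illustrative and modeled on a `usage` object with `prompt_tokens` and `completion_tokens` fields; some providers use `input_tokens` and `output_tokens` instead, so the hypothetical helper `read_usage` checks both. Confirm the exact field names in your provider's documentation.

```python
def read_usage(response: dict) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) from a parsed API response.

    Field names differ between providers; both common spellings are tried.
    Missing fields are reported as 0.
    """
    usage = response.get("usage", {})
    input_tokens = usage.get("prompt_tokens", usage.get("input_tokens", 0))
    output_tokens = usage.get("completion_tokens", usage.get("output_tokens", 0))
    return input_tokens, output_tokens


# Illustrative response data, not copied from a real API call.
sample = {"usage": {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}}

print(read_usage(sample))  # prints (12, 30)
```

Logging these numbers for every request is the simplest way to track real usage over time.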
Understanding tokens is essential for effectively using and budgeting for AI models.