What’s a token?

A token is a short sequence of characters that a language model treats as a single unit. Tokens are typically common character sequences: some are whole words, while others are fragments of a longer word. The exact definition varies because each model has its own tokenisation process.

A token can be as large as an entire English word, but longer or less common words are often broken down into multiple tokens. As a rule of thumb for English text, one token is about 4 characters, and roughly 100 tokens correspond to about 75 words. These are averages, so the actual count for a given text depends on the model's tokeniser.
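
The rule of thumb above can be turned into a quick estimator. The following is a minimal sketch, not a real tokeniser: the function names and constants are illustrative assumptions, and it only applies the averages quoted above (about 4 characters per token, about 75 words per 100 tokens). For exact counts you would need the tokeniser that belongs to the specific model.

```python
import math

# Rule-of-thumb averages for English text, taken from the paragraph above.
# These are approximations, not properties of any particular model.
CHARS_PER_TOKEN = 4
WORDS_PER_100_TOKENS = 75


def estimate_tokens_from_chars(text: str) -> int:
    """Estimate the token count from the character count (about 4 chars per token)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Estimate the token count from the word count (about 100 tokens per 75 words)."""
    word_count = len(text.split())
    return math.ceil(word_count * 100 / WORDS_PER_100_TOKENS)


if __name__ == "__main__":
    sample = "Understanding tokens helps you manage costs."
    print("By characters:", estimate_tokens_from_chars(sample))
    print("By words:", estimate_tokens_from_words(sample))
```

The two estimates will usually differ slightly, because they rest on different averages. Treat either one as a rough budget figure rather than an exact count.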

Understanding tokens helps you better interact with LLMs, manage costs, and make the most efficient use of the model’s capabilities.
