What’s a token

Background

When you prompt a Large Language Model (LLM), it doesn’t analyze individual characters or words directly. Instead, before your input reaches the LLM, it is first converted into tokens. This is because these models are trained to learn statistical relationships between tokens and to predict the next token in a sequence.

Given a prompt of x tokens, the model predicts the next token, resulting in x+1 tokens. It then predicts the next token again to reach x+2 tokens, and this process continues until a stopping condition is met (such as an end-of-sequence token or a length cap).
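This loop can be sketched in a few lines of Python. The `next_token` predictor below is a made-up stand-in (a fixed lookup table); a real LLM scores an entire vocabulary of tokens and samples from it, but the generate-append-repeat structure is the same.

```python
def next_token(tokens):
    # Hypothetical predictor: maps the last token to a fixed continuation.
    # A real model computes probabilities over its whole vocabulary.
    table = {",": "world", "world": "!", "!": "<eos>"}
    return table.get(tokens[-1], "<eos>")

def generate(prompt_tokens, max_new_tokens=10, stop_token="<eos>"):
    tokens = list(prompt_tokens)      # start with x prompt tokens
    for _ in range(max_new_tokens):   # stopping condition: length cap
        token = next_token(tokens)
        tokens.append(token)          # x tokens -> x + 1 tokens
        if token == stop_token:       # stopping condition: end token
            break
    return tokens

print(generate(["hello", ","]))
```

Note that the model never sees "words" here, only a growing list of tokens that it extends one step at a time.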

Why Understanding Tokens is Important

If you use an LLM, especially programmatically through an API, the cost is usually measured per token (often quoted per thousand or per million tokens for convenience). This is known as the ‘Token Price.’
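The arithmetic is simple but worth making concrete. The rates below are invented for illustration, not any provider’s actual prices; input and output tokens are priced separately because many APIs bill them at different rates.

```python
def request_cost(input_tokens, output_tokens,
                 input_price_per_million, output_price_per_million):
    # Per-token billing, with prices quoted per million tokens.
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000

# e.g. 2,000 prompt tokens and 500 response tokens at hypothetical rates
cost = request_cost(2_000, 500,
                    input_price_per_million=3.00,
                    output_price_per_million=15.00)
print(f"${cost:.4f}")
```

Because billing is per token, a verbose prompt costs real money at scale, which is one reason token counts matter.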

Moreover, whether you’re hosting the model yourself or using one hosted by another company, there is a maximum number of tokens the model can handle in a single request, usually counting both your prompt and the model’s response. This limit is called the ‘Token Limit’ (also known as the context window).

When working with an LLM, it’s crucial to know the token limit and how it applies to you. This knowledge allows you to include as much relevant information as possible in your prompts, optimizing the responses and making the most out of the cost per token.
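One common way to apply this in practice is to budget your prompt against the token limit, reserving room for the response. The sketch below assumes a rough `count_tokens` heuristic based on the ~4-characters-per-token average discussed later; real APIs expose an exact tokenizer you should use instead.

```python
def count_tokens(text):
    # Rough heuristic (~4 characters per token); not exact.
    return max(1, len(text) // 4)

def fit_context(chunks, token_limit, reserved_for_response=500):
    # Greedily keep chunks (most relevant first) until the budget is spent,
    # leaving headroom for the model's reply.
    budget = token_limit - reserved_for_response
    selected, used = [], 0
    for chunk in chunks:
        cost = count_tokens(chunk)
        if used + cost > budget:
            break
        selected.append(chunk)
        used += cost
    return selected

docs = ["chunk one " * 40, "chunk two " * 40, "chunk three " * 40]
kept = fit_context(docs, token_limit=700)
```

Ordering chunks by relevance before trimming is the key design choice here: whatever gets cut should be the least useful context, not simply whatever came last.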

What is a Token?

A token is a sequence of characters, typically a common one found in text; it may be a whole word, part of a word, or punctuation. The exact definition varies because each model has its own tokenisation process.

Generally, a token can be as large as an entire English word. However, longer or less common words might be broken down into multiple tokens. On average, one token is about 4 characters, and roughly 100 tokens correspond to about 75 words.
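Those averages give you two quick back-of-the-envelope estimators. Both are approximations only, since every model’s tokenizer splits text differently; use the model’s own tokenizer when you need exact counts.

```python
def estimate_tokens_from_chars(text):
    # ~4 characters per token, on average, for English text.
    return round(len(text) / 4)

def estimate_tokens_from_words(text):
    # ~100 tokens per ~75 words, on average.
    return round(len(text.split()) * 100 / 75)

sample = "Tokens are the units a language model actually reads."
print(estimate_tokens_from_chars(sample), estimate_tokens_from_words(sample))
```

When the two estimates disagree badly, the text is probably unusual (code, rare words, non-English), and the real token count is likely higher than either guess.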

Understanding tokens helps you better interact with LLMs, manage costs, and make the most efficient use of the model’s capabilities.
