AI relies on tokens to understand the user's prompt and translate it into a result. This input pattern is universal, and it adheres closely to how we communicate naturally as humans. We choose our words carefully because they hold meaning. Think about how different ways of saying "I exited" change the tone and meaning:

  • I departed
  • I slinked away
  • I escaped
  • I drifted
  • I bowed out

Words carry meaning in their associations, too. For example, if I prompt a generator to show me a bear in a city, it will tend to place a brown bear in a generic western city. If I specify that I am asking for a panda bear, it will place the bear in China. Why? I didn't ask for a location change. This is an example of the bias inherent in our words. Pandas have a cultural association with China, so the AI pulls that connection into its definition of a city when placing the bear in context.

Token layering is a technique that intentionally combines tokens to refine the AI's understanding of your prompt and the direction of its response. Some tools, such as image generators, initially leaned into this pattern quite literally, instructing users to list and weight tokens as their primary prompt. This was so popular that sharing compelling Midjourney tokens became a cottage industry.
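To make the weighting idea concrete, here is a minimal sketch of how a hand-written, layered prompt might be assembled. The Token shape and buildPrompt helper are illustrative assumptions; only the output format (concepts separated by "::" with optional numeric weights) follows Midjourney's documented multi-prompt syntax:

```typescript
interface Token {
  text: string;
  weight?: number; // relative emphasis; omitted defaults to 1
}

function buildPrompt(tokens: Token[]): string {
  // "::" closes each concept; a trailing number sets its relative weight.
  return tokens.map((t) => `${t.text}::${t.weight ?? ""}`).join(" ").trim();
}

// Produces: "panda bear::2 city street:: watercolor::0.5"
console.log(
  buildPrompt([
    { text: "panda bear", weight: 2 },
    { text: "city street" },
    { text: "watercolor", weight: 0.5 },
  ])
);
```

Notice how the weights let the user say which concepts should dominate the result, something a flat sentence cannot express.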

However, while this is highly effective for creative brainstorming or for injecting specific styles and tones through in-painting, it can be a difficult technique for new users to grasp. Even then, it's hard for a list of tokens to fully describe your intent.

Web interfaces appear to be converging on an evolved direction that blends an open-ended prompt with layered tokens. Take Adobe Firefly, for example, which asks you to write what you are looking for while a palette on the side of the screen lets you choose from stylistic, structural, and referential tokens. This pattern also appears in Udio and other music generators.

Adobe Firefly makes it easy to add and view tokens in a prompt
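As a rough sketch of this blended pattern, the free-text prompt and the selected token chips can be modeled separately and composed only when the request is sent. The types and composePrompt helper below are hypothetical, not Firefly's actual implementation:

```typescript
type TokenCategory = "style" | "structure" | "reference";

interface TokenChip {
  category: TokenCategory;
  label: string; // e.g. "watercolor", "wide angle"
}

interface PromptState {
  freeText: string;   // the open-ended prompt the user typed
  chips: TokenChip[]; // tokens picked from the side palette
}

// Chips stay distinct and dismissible in the UI; they are only folded
// into the prompt string at generation time.
function composePrompt({ freeText, chips }: PromptState): string {
  const modifiers = chips.map((c) => c.label).join(", ");
  return modifiers ? `${freeText}, ${modifiers}` : freeText;
}

// "a panda bear in a city, watercolor, wide angle"
console.log(
  composePrompt({
    freeText: "a panda bear in a city",
    chips: [
      { category: "style", label: "watercolor" },
      { category: "structure", label: "wide angle" },
    ],
  })
);
```

Keeping the chips out of the raw text until generation is what makes them dismissible: removing one never requires the user to re-edit their sentence.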

Tokens can also be collected as a follow-up to the initial prompt. Jasper and Perplexity are two notable examples of products that auto-generate follow-up questions after the prompt is submitted, which themselves serve as inputs to capture additional tokens. The effect is a system that progressively works to better understand the user's intent without it feeling like a burden up front.

Finally, tokens can be introduced as a follow-up action to help the user instruct the AI on how to modify its result. In this case, suggestions can come from a set list (as is the case with Grammarly or Notion) or be generated automatically from the existing context, as the follow-up above does. This method of collecting tokens and additional parameters is another example of how progressive disclosure can reduce the up-front lift on the user.
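A sketch of that flow, under assumed shapes: each answered follow-up layers a new token onto the running context, and the follow-ups themselves come from either a fixed list or a generator function standing in for a model call:

```typescript
interface PromptContext {
  prompt: string;
  tokens: Record<string, string>; // e.g. { tone: "formal" }
}

// Fixed follow-ups, in the spirit of Grammarly or Notion.
const FIXED_FOLLOW_UPS = ["tone", "audience", "length"];

// Prefer generated follow-ups when a generator is available;
// otherwise fall back to the curated set.
async function getFollowUps(
  ctx: PromptContext,
  generate?: (prompt: string) => Promise<string[]>
): Promise<string[]> {
  return generate ? generate(ctx.prompt) : FIXED_FOLLOW_UPS;
}

// Each answered follow-up folds another token into the context.
function applyAnswer(
  ctx: PromptContext,
  key: string,
  answer: string
): PromptContext {
  return { ...ctx, tokens: { ...ctx.tokens, [key]: answer } };
}
```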

Details and variations

The interactive approach to token layering will vary by the context of its use:

  • In more technical interfaces, like using Midjourney from within Discord, token layering is manual and hand-written by the user. Make use of related patterns like token transparency and parameters to help users fine-tune their input
  • In web-based interfaces like Adobe Firefly, tokens can be more visual, and treated more like the traditional UX pattern of chips: distinct and dismissible
  • In either case, differentiate tokens from parameters and give users easy ways to better understand how tokens impact their final result. Midjourney's /shorten command is a good example of this, letting users easily see the relative weights of the tokens in their prompt

Considerations

Positives

Tokens as delight

Getting comfortable with AI can be intimidating, and even expert users will sometimes have difficulty nailing the result. When tokens can be easily added to the prompt (bonus points if the AI generates relevant tokens for the user to choose from), it reduces the lift required to get the AI to produce what the user is looking for.

Balance accuracy with progressive disclosure

Not all tokens need to be added at once; in fact, starting with a small prompt or list of tokens and adding on can be an effective way to tune the result progressively. Consider how much context the AI requires to get a response that is on track, or to give the user suggestions for how to expand their input. Then provide a mechanism for the user to progressively add more information until they reach their goal.

Use the AI to use the AI

Rather than having a set list of tokens for users to choose from, or leaving it totally open-ended, consider using the AI to generate suggested additional tokens for the user to add. This reduces overhead effort and builds trust.
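A minimal sketch of that approach, assuming an OpenAI-style chat completion as the suggestion engine; the model name, prompt wording, and line-based parsing are all assumptions, not a production recipe:

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Ask the model to propose tokens the user could layer onto their prompt.
async function suggestTokens(userPrompt: string): Promise<string[]> {
  const response = await client.chat.completions.create({
    model: "gpt-4o-mini", // assumed model; swap for whatever you use
    messages: [
      {
        role: "system",
        content:
          "Suggest five short style or subject tokens that could refine " +
          "the following prompt. Return one token per line, no numbering.",
      },
      { role: "user", content: userPrompt },
    ],
  });
  const text = response.choices[0].message.content ?? "";
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean);
}

// suggestTokens("a panda bear in a city")
// => e.g. ["watercolor", "night market", "neon lighting", ...]
```

Presenting the suggestions as tappable chips, rather than raw text, keeps the pattern consistent with the palette-style interfaces described above.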

Potential risks

Cost of iteration

There are real costs associated with generating multiple iterations of a result, so it can benefit both the user and your bottom line to make it as easy as possible to build a robust prompt up front. Users may have a fixed set of credits to run AI generations, or they may get frustrated by the investment of time and opt out. Strike a balance between learning through iteration and providing, as soon as possible, a high-quality result that is close to their stated intent.

Use when:
A user wants to control their prompt input with raw tokens, either listed out or communicated in sentence form.

Examples

Zapier's image generator template includes a prompt to add tokens for style and description
Perplexity adds its follow-ups to capture additional tokens after the first prompt has been submitted. Each follow-up is also auto-generated
Jasper auto-generates follow-ups that ask the user to input additional tokens to guide the generation
Grammarly relies on a fixed set of follow-ups to capture additional tokens like voice and tone or audience
FigJam also uses progressive disclosure but presents its suggested tokens as a fixed set
Udio and other audio generators combine an open ended prompt with specific token prompts in order to return a close result on the first try