AI relies on tokens, explicitly or implicitly, and each token carries meaning that may be imperceptible to the user. When users can see the tokens that are informing the output, they can discover new creative paths and hone their input.

Only a few platforms make use of this as an interface pattern. Notably, Midjourney's /describe function reveals the tokens and implied prompt behind an image. This can be applied to images that the generator produced or to human-created files. Using the variations pattern, Midjourney reveals four different interpretations of the image and makes it easy for the user to regenerate each option to evaluate how closely it reproduces the original tone, subject, and context.

After using token layering and inpainting to generate this abstract image, I tried to reverse engineer it to see the tokens the generator relied on

This can produce some entertaining results, like learning that "Joni Mitchell" was an influence for an abstract image. It also helps keep our images free of copyright infringement and copycatting. Midjourney explicitly calls out any artists whose work could be interpreted as an influence, which both reveals the ethical flaws in its training data and helps users rework their prompts to avoid that unintentional reference.

Alternatively, when interacting directly with an AI, you can simply ask it to share the tokens it relied on. However, this is not a technique lay users will be familiar with, so it's unreliable as a solution. Furthermore, unless you are using the open input to communicate with the AI, there is no mechanism to ask for its intent. This leaves the user blind.

ChatGPT will tell you its reference tokens

Finally, tokens can be transparently shared in a gallery or in the details of a generated file. Audio generators like Udio include the top tokens in the description of songs in their examples gallery. This can help users search for relevant audio files and gather clues for how to prompt for similar sounds. Other generators include the tokens in the metadata of the file itself.

Udio includes a full list of tokens associated with a song clip in the gallery.

Details and variations

  • The mechanisms to have the AI share the tokens it relied on can be explicit, like a command, or implicit, such as in metadata
  • Consider showing multiple interpretations if the image has been created through multiple prompts, remixing, or inpainting
  • If it's necessary to constrain the number of tokens shown, bias towards showing the most impactful tokens, as tokens carry different weights
  • Consider treating tokens as parameters once revealed, letting users increase or decrease the weight of certain tokens related to tone and style. For example, a "professional" token in a text prompt could be followed by a trigger to regenerate the response to be "more professional" or "more casual"
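
The last two points can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Token` type, the 0 to 1 weight scale, and the sample tokens are hypothetical and do not reflect how Midjourney, ChatGPT, or Udio represent tokens internally. It shows one way to surface only the highest-weight tokens and to re-weight a single token before regenerating.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Token:
    text: str
    weight: float  # hypothetical relative influence on the output


def top_tokens(tokens, limit):
    """Return the `limit` highest-weight tokens, strongest first."""
    return sorted(tokens, key=lambda t: t.weight, reverse=True)[:limit]


def adjust(tokens, text, factor):
    """Return a new list with the named token's weight scaled by `factor`."""
    return [
        replace(t, weight=t.weight * factor) if t.text == text else t
        for t in tokens
    ]


# Made-up tokens and weights, for illustration only.
tokens = [
    Token("blue", 0.1),
    Token("abstract", 0.9),
    Token("professional", 0.4),
    Token("watercolor", 0.6),
]

# Constrain the display to the three most impactful tokens.
shown = top_tokens(tokens, limit=3)
print([t.text for t in shown])

# A "more professional" trigger could double that token's weight
# before the prompt is sent back for regeneration.
more_professional = adjust(tokens, "professional", 2.0)
```

Keeping the weights alongside the token text means the same data can drive both the read-only display and any follow-up controls, such as a "more casual" button that applies a factor below 1.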

Considerations

Positives

See into the model

Token transparency lets users understand how the model is interpreting their input, giving them the ability to improve their prompts and constrain the AI's result

Avoid unintentional references

Some words carry bias and meaning that we don't intend, such as references to copyright-protected sources or unconscious biases. Seeing those connections is the first step to avoiding them

Potential risks

Intentional misuse

When token transparency alerts users to creators whose works were scraped into the model's training data, it can make it easier for users to continue to reference that artist's work in their creations without just compensation

Visibility only

Token transparency can alert users to unintentional references and biases in their prompt, but it falls to the user to track down the impact of those unintentional tokens and revise their input

Examples

The /describe function on Midjourney will show four interpretations of the tokens behind an image
Midjourney's /shorten function goes a step further to show you the relative weights of tokens in a prompt
ChatGPT will tell you which tokens it was using to craft a response
Example galleries like the song listings on Udio reveal the tokens associated with the audio
Midjourney puts some tokens in the metadata of files for reference