It's hard to capture the full intent of our inquiry and communicate it to the AI in a single prompt. Primary sources help by capturing a density of data for the AI to reference.

Depending on your use case, references may guide a generative outcome or serve as the foundation on which the AI builds its response.

As a guide

We know that AI uses tokens, contained in data, to understand the user's intent through their prompt and return something that sufficiently matches that intent. References empower the user by giving the AI a clear depiction of what they are looking for. This makes the AI's job easier in turn, as it can use that reference as an initial filter to narrow the data it draws on before building its response.

Not all uses of references need to be focused on obtaining predictable outcomes. Multi-media generators often allow you to upload a sample reference, which can guide the style, tokens, or both that the AI uses when generating from your prompt. This is especially fun with inpainting and can produce unpredictable results of editorial quality.

References can be used to guide the generation of something, like examples attached to a prompt to direct Claude, writing samples to generate a brand voice, or image samples to maintain tone across a set of generations.
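As a rough illustration of references used as a guide, the sketch below prepends writing samples to an instruction so the model can match their voice. The function name `build_guided_prompt`, the `<sample>` tag format, and the steering sentence are all assumptions made for this example; they are not the format of any particular product.

```python
from typing import List


def build_guided_prompt(instruction: str, samples: List[str]) -> str:
    """Assemble a prompt where writing samples act as a style guide.

    Hypothetical helper: the tag names and wording are illustrative only.
    """
    parts = []
    for index, sample in enumerate(samples, start=1):
        # Label each sample so the model can tell them apart.
        parts.append(f'<sample index="{index}">\n{sample.strip()}\n</sample>')
    # The samples steer tone; the instruction still says what to produce.
    parts.append("Match the voice and tone of the samples above.")
    parts.append(instruction.strip())
    return "\n\n".join(parts)


prompt = build_guided_prompt(
    "Write a two-sentence product announcement.",
    ["We keep things short. We keep things friendly."],
)
print(prompt)
```

The samples come before the instruction so they read as context rather than as the task itself. Whether that ordering helps in practice depends on the model being used.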

As a source

Sometimes the reference acts as the subject of the prompt itself, such as when a user asks the AI to summarize or synthesize it. These are the use cases where the lines between primary references and general references start to blur. The distinction is that a primary source is the object that the AI is explicitly interacting with, whereas general references operate as a layer on top of the LLM, much like when I ask the AI to summarize some topic and include internal resources for it to draw on first.

In these situations, the user expects to be able to upload some document and have the AI answer questions about it specifically, or offer specific suggestions for how to change it. The describe function of image generators is an example of this happening explicitly, while Adobe PDF's AI functionality is a good example of blending Copilot chat with the central object.

Primary references can be used as the central object of a prompt, like asking GitHub Copilot to explain some piece of code, having Adobe PDF explain what is happening in a document, or having Perplexity expand on the ideas contained in a white paper.

Details and variations

  • Primary sources are attached when the prompt is first generated, either by uploading them from the input or by prompting within the context of the document, video, etc. itself
  • Primary sources are the main anchor for the AI when included. Additional references may be used to help the AI expand its answer
  • Citations that the AI produces may relate to specific places within the primary source
  • Multiple primary sources can be included if helpful
  • The AI can include tokens in its prompt that are ex
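One way to picture the relationship described in this list is as a small data structure: primary sources come first as the anchor, additional references follow, and a citation points back to a place in a primary source. The sketch below is an assumption-laden illustration; `Source`, `Request`, `render`, `locate_citation`, and the bracketed labels are hypothetical names, not any tool's real API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Source:
    name: str
    text: str


@dataclass
class Request:
    question: str
    primary: List[Source]  # the objects the AI interacts with directly
    references: List[Source] = field(default_factory=list)  # optional extras


def render(request: Request) -> str:
    """Place primary sources first so they anchor the response."""
    blocks = []
    for src in request.primary:
        blocks.append(f"[PRIMARY SOURCE: {src.name}]\n{src.text}")
    for src in request.references:
        blocks.append(f"[ADDITIONAL REFERENCE: {src.name}]\n{src.text}")
    blocks.append(request.question)
    return "\n\n".join(blocks)


def locate_citation(source: Source, quote: str) -> Optional[int]:
    """Return the 1-based line of the source containing the quote, if any."""
    for number, line in enumerate(source.text.splitlines(), start=1):
        if quote in line:
            return number
    return None


paper = Source("white_paper.txt", "Intro\nTokens guide generation.\nConclusion")
request = Request("Summarize the paper.", [paper])
print(render(request))
print(locate_citation(paper, "Tokens guide"))
```

Matching a quote to a line is the simplest possible form of citation. Real products may cite pages, timestamps, or highlighted regions instead.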

Considerations

Positives

A picture speaks 1000 words

Giving AI an example of what you are looking for makes it far more likely that it will understand your intent, which can get lost in a prompt. This is especially helpful for more nuanced instructions, like voice and tone, which can be difficult to describe. Consider building an AI-friendly brand guide that can be uploaded as a source when people are using AI to generate first drafts for branded material.

Potential risks

Unseen bias

Every bit of information we provide the AI contains data that we cannot see. This data can lead to unexpected results. Take an example where a reference source includes plagiarized work or inaccurate findings, unbeknownst to the user. This can happen today, but due to automation bias we may be less likely to scrutinize the results that AI finds for us. As a result, this would remove agency from the user and can degrade trust.

Security risk

When third-party information can be provided directly to the AI, the data contained in that resource is vulnerable to being scraped into the AI's training data. In most situations this would lead to very minimal harm. However, if the reference included proprietary or personal data, this could result in unethical or even illegal practices, and the proliferation of data that is not yours to share. Consider ways to check for the ownership of the reference. For example, Udio will not allow users to upload a reference song if they don't have the rights to do so.

Use when:
You have a clear intent in mind for what the AI should produce or documents that the AI should explicitly interact with when forming its response

Examples

Midjourney and Adobe Firefly allow users to attach sample images as a primary source for the AI to reference for additional tokens and details
Copy.ai can generate a tone of voice by referencing a sample piece of writing
Claude supports multiple primary sources to anchor the AI as it generates a response
Attached primary sources are visible in the Claude input box
Adobe PDF interacts directly with the PDF as a primary source
GitHub Copilot lives in your code editor so it can interact directly with your repo as a primary source