Attachments

Attachments allow users to shape a model’s logic by providing specific information for it to reference when building its response to the prompt. This grounds the AI’s response in a denser and more reliable context, reducing ambiguity, counteracting hallucinations, and helping users feel more in control of the generation.

Users can add attachments when shaping their initial prompt, as part of a follow-up, or from within a canvas or other shared space. These impact the model’s behavior in subtle ways, like shifting token probabilities, or in explicit ways, like limiting answers to a provided PDF.

Attachment methods

While a paperclip icon has become ubiquitous in most direct input fields, uploading a file is far from the only way of sharing contextual sources with the AI:

  • Direct upload: Added via file upload, connection (e.g. a Google Doc or Notion page), or by @-mentioning a file, tab, agent, or conversation directly in the open text input box. Use this when it is unclear which reference the AI should pull from behind the scenes, or where user-initiated intent is necessary, such as for the initial prompt.
  • Inline action: Commonly implemented as text selection, where inline actions appear relative to the highlighted area. Media content (image, video, audio) often includes overlays or menu options that start a new generation with the media file as an attached source. In conversational experiences, an action to quote the text may appear when text is highlighted. When selected, the action injects the copied quote into the text box.
  • URL embed: Added by pasting a link into the input box. The AI fetches the page and treats it as context while forming its generation. Be clear whether the AI will reference only the linked page or all connected pages.
  • Canvas block: Added by using the pointer action to select a single div, node, drawing, etc. on an open canvas surface. Depending on the context, the AI may prioritize this content for its next generation, or limit its regeneration and subsequent actions to the selected area.
  • Live capture: Supplied by recording or snapping media in the moment, such as taking a screenshot, photo, or audio clip, or allowing an agent to interact with the user’s browser. In many modes this will result in the selection being attached as a direct upload. In agentic mode, the attachment guides the AI’s logic as it forms its next step, and may not require human interaction.
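The methods above can share a single data model, so the interface can tell users what the AI will reference regardless of how the source was added. The following is a minimal sketch, not a reference implementation: the names `AttachmentMethod`, `Attachment`, `follow_links`, and `describe_scope` are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class AttachmentMethod(Enum):
    """The five attachment methods listed above (names are illustrative)."""
    DIRECT_UPLOAD = "direct_upload"
    INLINE_ACTION = "inline_action"
    URL_EMBED = "url_embed"
    CANVAS_BLOCK = "canvas_block"
    LIVE_CAPTURE = "live_capture"


@dataclass(frozen=True)
class Attachment:
    method: AttachmentMethod
    label: str                  # text shown to the user for this attachment
    content: str                # extracted text, or a reference to the media
    follow_links: bool = False  # hypothetical flag, only meaningful for URL embeds


def describe_scope(att: Attachment) -> str:
    """Return user-facing text explaining what the AI will reference."""
    if att.method is AttachmentMethod.URL_EMBED:
        if att.follow_links:
            return f"{att.label}: the linked page and the pages it links to"
        return f"{att.label}: the linked page only"
    if att.method is AttachmentMethod.CANVAS_BLOCK:
        return f"{att.label}: the selected area of the canvas"
    return f"{att.label}: the attached content"


if __name__ == "__main__":
    page = Attachment(AttachmentMethod.URL_EMBED, "Pricing page", "https://example.com/pricing")
    print(describe_scope(page))
```

Keeping the scope description next to the attachment record is one way to make the "linked page only vs. all connected pages" distinction visible wherever the attachment is displayed.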

Using attachments as a style guide

Attachments can provide the model with tokens that help convey the user’s intent without requiring or relying on the user to detail them in full. This helps the model infer the user's intended content, structure, tone, style, etc., saving the user time while making it more likely the initial generation suits their intent.

To maximize user control and oversight, explore how you might let users understand what tokens the attachment is likely to contribute to the prompt, such as by providing a describe action for attachments.

Attachments can be used to guide a generation, like examples attached to a prompt to direct Claude, writing samples to generate a brand voice, or image samples to maintain tone across a set of generations.

Using attachments as the primary source or subject

Attachments can be used to focus the AI on their contents, such as when summarizing an idea or report, conveying a script or outline, or constraining search to a specific area of a subject. The AI may be directed to focus only on the attachment, or to combine it with other sources it finds on its own. All sources should be listed as references in the final generation.

Visually separate attachments used as guides from those used as primary sources. For primary sources, provide citations to the areas of the attachment the AI leveraged in its generation. If the model retrieves additional sources, distinguish them from those added directly by the user.
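One way to keep user-provided and model-retrieved sources distinct is to record each source's origin and group the reference list by it. The sketch below assumes a hypothetical `Source` / `Citation` structure and hypothetical group headings ("Added by you", "Found by the AI"); a real product would choose its own wording and locator format.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    USER = "user"    # attached directly by the user
    MODEL = "model"  # retrieved by the model on its own


@dataclass(frozen=True)
class Source:
    title: str
    origin: Origin


@dataclass(frozen=True)
class Citation:
    source: Source
    locator: str  # the area of the source that was used, e.g. "p. 4"


def build_references(citations: list[Citation]) -> dict[str, list[str]]:
    """Group cited sources by who supplied them, dropping duplicate entries."""
    references: dict[str, list[str]] = {"Added by you": [], "Found by the AI": []}
    for citation in citations:
        if citation.source.origin is Origin.USER:
            heading = "Added by you"
        else:
            heading = "Found by the AI"
        entry = f"{citation.source.title} ({citation.locator})"
        if entry not in references[heading]:
            references[heading].append(entry)
    return references


if __name__ == "__main__":
    report = Source("Q3 report.pdf", Origin.USER)
    overview = Source("Industry overview", Origin.MODEL)
    cited = [Citation(report, "p. 4"), Citation(overview, "section 2"), Citation(report, "p. 4")]
    for heading, entries in build_references(cited).items():
        print(heading, entries)
```

Storing the locator with each citation is what lets the interface link a claim back to the specific area of the attachment rather than to the file as a whole.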

Attachments can be used as the central object of a prompt, like asking GitHub Copilot to explain some piece of code, having Adobe PDF explain what is happening in a document, or having Perplexity expand on the ideas contained in a white paper.

Design considerations

  • Allow attachments at any time. Let users include files when first composing a prompt or during later regenerations so they can refine results without losing work.
  • Use multiple input methods. Not every attachment needs to be a file. Enable uploads, @-mentions, clipboard paste, drag-and-drop, and canvas references to reduce friction across devices and workflows.
  • Let users give attachments a purpose. Make clear whether a reference will guide style, serve as the subject under analysis, or act as a hybrid. Midjourney's attachment pane offers a good example, where users can specify whether it should direct the prompt, style, or subject.
  • Provide citations when referencing files. Quote, annotate, or link back to attachment content so users can trace the AI’s outputs directly to their sources, as Granola does with transcripts.
  • Protect organizational data. Encrypt attachments in transit and at rest, separate them from training data pipelines by default, and allow enterprises to enforce stricter controls.
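Giving attachments a purpose can be carried through to how the prompt is assembled, so a style reference and a subject under analysis are never presented to the model as the same kind of input. This is a rough sketch under stated assumptions: the `Purpose` values, the bracketed section labels, and the instruction wording are all hypothetical, not a format any particular model or product requires.

```python
from dataclasses import dataclass
from enum import Enum


class Purpose(Enum):
    STYLE = "style"      # guides tone, structure, or look
    SUBJECT = "subject"  # the material under analysis
    HYBRID = "hybrid"    # both at once


@dataclass(frozen=True)
class PurposedAttachment:
    name: str
    text: str
    purpose: Purpose


# Hypothetical per-purpose instructions placed ahead of each attachment's text.
_INSTRUCTIONS = {
    Purpose.STYLE: "Match the tone and structure of this reference; do not quote it.",
    Purpose.SUBJECT: "Treat this as the material to analyze, and cite it.",
    Purpose.HYBRID: "Analyze this material and also match its tone and structure.",
}


def assemble_prompt(user_prompt: str, attachments: list[PurposedAttachment]) -> str:
    """Build one prompt string with a labeled section per attachment, then the request."""
    sections = []
    for att in attachments:
        sections.append(
            f"[{att.purpose.value.upper()}: {att.name}]\n"
            f"{_INSTRUCTIONS[att.purpose]}\n"
            f"{att.text}"
        )
    sections.append(f"[REQUEST]\n{user_prompt}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    prompt = assemble_prompt(
        "Summarize the report.",
        [
            PurposedAttachment("voice.txt", "Short, plain sentences.", Purpose.STYLE),
            PurposedAttachment("report.txt", "Revenue rose.", Purpose.SUBJECT),
        ],
    )
    print(prompt)
```

The same purpose value can drive the interface, for example by rendering style references and subjects in visually separate groups.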

Examples

Midjourney accepts attachments with multiple purposes, from style guides to composition, and even to identify a specific subject.
As demonstrated by Perplexity, some products interact with the document within the surface of the interaction to show progress and make clear how they are using the attached source.