As AI becomes more prevalent, it's increasingly important for people and for platforms to be able to differentiate between generated content and human-created content. For consumers, this helps you sort through the bombardment of online content with more confidence and agency. For creators, this protects your work from unauthorized reproduction and counterfeits. For researchers and platform owners, this protects the quality of the inputs into your work.

Watermarks help by embedding or identifying tracers in text, images, audio, and video when content is created digitally by a model.

[Note: For the sake of simplicity, this combines the two approaches of digital watermarking and content provenance into one term.]

Watermark types

Overlay watermarks are visual symbols or text added superficially to the content as a post-processing technique. They can be added by generators or applied by the platforms where the content is seen. Overlay watermarks are not integrated into the structure of the content and can therefore be easily removed.
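
To make the mechanics concrete, here is a minimal sketch in Python using the Pillow library. The file names and label text are illustrative placeholders, not any platform's standard:

```python
# A minimal overlay watermark: composite a semi-transparent text label
# onto the image after generation. All names here are placeholder choices.
from PIL import Image, ImageDraw

def add_overlay_watermark(path_in: str, path_out: str, label: str = "AI generated") -> None:
    image = Image.open(path_in).convert("RGBA")
    overlay = Image.new("RGBA", image.size, (0, 0, 0, 0))
    draw = ImageDraw.Draw(overlay)
    # Draw the label in the bottom-left corner with partial transparency.
    draw.text((10, image.height - 24), label, fill=(255, 255, 255, 160))
    Image.alpha_composite(image, overlay).convert("RGB").save(path_out)

add_overlay_watermark("generated.png", "labeled.png")
```

Because the label is simply composited onto the pixels, a crop or re-export can strip it entirely, which is why overlay watermarks are the weakest of these types.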

Steganographic watermarks embed patterns into the content that are imperceptible to humans. These might include subtle changes to pixels in video and images, additional spaces in text, or slight note changes in audio files. While slightly better protected than overlay watermarks, these can be masked, whether accidentally or by bad actors, through small changes to the file such as adding a Gaussian blur.
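
As a rough illustration of the principle, the sketch below hides a payload in the least significant bit of each pixel channel, one of the simplest steganographic techniques. Real systems use far more robust encodings; the function names here are hypothetical:

```python
# Toy least-significant-bit (LSB) steganography: overwrite the lowest bit
# of each RGB channel value with one bit of the message.
import numpy as np
from PIL import Image

def embed_message(path_in: str, path_out: str, message: bytes) -> None:
    pixels = np.array(Image.open(path_in).convert("RGB"))
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    flat = pixels.flatten()
    assert bits.size <= flat.size, "image too small for this payload"
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits  # set low bits
    # Must be saved losslessly; JPEG compression would destroy the low bits.
    Image.fromarray(flat.reshape(pixels.shape)).save(path_out, format="PNG")

def extract_message(path: str, n_bytes: int) -> bytes:
    flat = np.array(Image.open(path).convert("RGB")).flatten()
    return np.packbits(flat[: n_bytes * 8] & 1).tobytes()
```

The fragility is visible in the code itself: any operation that perturbs pixel values, like a blur or a lossy re-encode, scrambles the low-order bits and erases the mark.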

Machine learning watermarks take a newer approach, relying on machine learning models to add a distinct key to the content that can only be read by other models. This is the process Google DeepMind is using in its SynthID approach. These are stronger than the first two types, but can degrade as the image is modified.

Statistical watermarks are random patterns injected into the image, audio, or text by the generator itself. While still operating at the surface level of the content (rather than its foundational structure), the randomness of this approach makes it far more difficult to crack or mask, especially for the average user.
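
For text, one published version of this idea is the "green list" scheme described by Kirchenbauer et al. (2023): the generator pseudo-randomly partitions the vocabulary based on the preceding token and biases sampling toward the "green" subset, so watermarked text contains a statistically improbable share of green tokens. The detector sketch below is a simplified assumption of how checking works, with naive whitespace splitting standing in for a real tokenizer:

```python
# A simplified detector for a green-list statistical watermark. Honest
# human text should score near GREEN_FRACTION; text from a generator
# that biases sampling toward green tokens scores well above it.
import hashlib

GREEN_FRACTION = 0.5  # expected green share in unwatermarked text

def is_green(prev_token: str, token: str) -> bool:
    # Deterministic pseudo-random assignment keyed on the preceding token.
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION

def green_rate(text: str) -> float:
    tokens = text.split()
    hits = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)
```

A deployment would run a statistical significance test on green_rate() over many tokens, which is also why very short passages are difficult to verify either way.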

Compared to content provenance

This alternative approach embeds data about the origin (or provenance) of the content itself, creating a digital fingerprint in the content's metadata. Content provenance requires the cooperation of content generators and digital platforms to create and enforce these standards. Many large tech platforms have signed onto a pledge to incorporate standards developed by the Coalition for Content Provenance and Authenticity (C2PA), which defines a common format that cannot be tampered with without leaving traces of manipulation. This data can then be read by any platform that adopts the protocol, including the full history of modifications.
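
The tamper-evidence idea can be sketched in a few lines: bind a signed manifest (origin plus edit history) to a hash of the content, so that any modification invalidates the check. Real C2PA manifests use certificate-based signatures embedded in the file itself; the symmetric key below is only a stand-in for illustration:

```python
# Toy provenance manifest: a signed record of the content hash and its
# edit history. Changing the content or the history breaks verification.
import hashlib, hmac, json

SIGNING_KEY = b"placeholder-key"  # real systems use asymmetric certificates

def make_manifest(content: bytes, history: list[str]) -> dict:
    manifest = {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "history": history,  # e.g. ["generated by model X", "cropped"]
    }
    payload = json.dumps(manifest, sort_keys=True).encode()
    manifest["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return manifest

def verify(content: bytes, manifest: dict) -> bool:
    claimed = {k: v for k, v in manifest.items() if k != "signature"}
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(manifest["signature"], expected)
            and claimed["content_sha256"] == hashlib.sha256(content).hexdigest())
```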

The Content Credentials Verifier allows you to upload content and see what watermarks are embedded in its metadata.

Regulatory landscape

Many governments are taking early steps to enforce watermarks in both scenarios.

  • China requires source-generated watermarks, as well as the inclusion of identifying metadata in the filename when content is downloaded.
  • The European Union's AI Act imposes similar standards regarding labeling generated content, and also imposes requirements on foundation models to ensure transparency and compliance with copyright law.
  • In the United States, a 2023 Executive Order established standards for watermarking and regulation, but action by Congress is required to codify these standards for enforcement.

The implications of this regulatory tapestry remain to be seen. If you are working on a foundation model or generative tool, you can prepare by developing principles and standards to evaluate the product experience. If you are working on a platform, think through how to label this content in different scenarios.

No conventional approach has yet been established that mitigates the competing risks of this issue, and it will most likely need to be solved through a combination of regulation and code. Additional innovation is being pursued through other means, like MIT's work on PhotoGuard, which essentially blocks generative tools from manipulating certain aspects of an image.

Designers can lean in to help establish human-centered principles and conventions, and carefully construct copy and caveats to help users be as informed as possible.

If you want to track the progress of this issue, here are some resources to get started:

Articles and Explainers
Policies, Protocols, and Tools

Details and variations

Watermarks need to be accessible to be useful, but their form may change depending on the use.

  • Watermarks can be visible to the user or imperceptible, depending on the use
  • Give users some way to verify the authenticity of content if watermarks are not visible to the naked eye
  • Consider ways to offer additional context, such as caveats, that inform users of the risk of false negatives and false positives in watermarking
  • Where possible, use emerging standards like the CR mark to decrease the cognitive load on users

Considerations

Positives

Increase user agency

For low-stakes usage, like images in a social media feed, where the harm of liking an image that turns out to be AI-generated is relatively low, simply being able to identify AI-generated content most of the time could be a huge boon to consumers who don't want to be duped.

Collective action

The rise of misinformation, combined with the advent of generative AI, has led governments and large platforms to rally together to solve the issue. The regulatory landscape is still incomplete and inconsistent, but with support from both the public and private sectors, consumers will benefit from a multi-layer coalition working to meaningfully solve this problem.

Potential risks

Vulnerability

The technology to support and enforce watermarks is nascent and unreliable at best. Researchers have found that AI-generated images and text can be easily manipulated, without significant decreases in quality, to mask or fully remove embedded watermarks.

Lack of standards

Protocols for more advanced techniques like content provenance are only useful if they are adopted widely enough to compel and aid enforcement. While some, like C2PA, are achieving widespread adoption by large tech organizations, we are still in the early days.

False negatives

If we fail to accurately label AI content, it creates the perception for users and downstream products that the content is human-generated (assuming watermarking of some form is being used). This can create harm and degrade trust.

False positives

Swinging too far in the other direction can also cause harm, like accusing people of plagiarizing content they actually wrote.

Use when:
Consumers and the products they use need to be able to distinguish synthetic content so it can be identified and, as needed, segregated from content generated by humans.

Examples

Adobe's Content Credentials protocol has been adopted by TikTok and other platforms with a shared icon intended to be a universal indicator of synthetic content
Adobe Firefly warns users on their first use that it will watermark any images generated with its tools
TikTok became the first social media company to support Content Credentials metadata tags for all content created on its platform.
TikTok also applies watermarks to posts. Creators can label a post as AI-generated, or, if the generative element comes from the effects being used, the label is applied automatically by the platform.
Meta is taking a similar approach to TikTok, adding badges to AI-generated images and content (including captions) in posts across its products
Snap has developed its own proprietary symbol for AI content generated on its platform
iA Writer offers an example of a tool proactively watermarking generated text, distinguishing it in grey from text actually authored by the user
Tools like Writer.ai's AI DETECT help alert others to the use of auto-generated text, but can result in false positives
OpenAI has adopted the C2PA protocol to watermark content generated through its API and DALL-E (shown here using the Content Credentials Verifier)