Giving users the ability to rate their interactions has become a table-stakes pattern in service and conversational experiences (think chat support or Uber).

On its face, this pattern is tame and familiar. Its potential risk is buried and hidden from the user: what happens after they rate their experience?

  • In scenario A, the user knows they have been interacting with the model. A thumbs-up or thumbs-down signals to prompt engineers whether the design of the model itself is effective. This could be especially helpful for proprietary internal models or secure models trained on sensitive data.
  • In scenario B, the user doesn't know whether they are interacting with a human or a model, or they don't know what experiments the company is running to potentially replace human engagement with digital engagement. Even an average person could feel put off by the ethical implications of that lack of transparency.

The effects of this may be innocuous, but with unknown actors in the space and a lack of transparency into training data, asking for this type of input without providing a direct benefit to the user can degrade trust.

In application, this pattern is fairly standardized as thumbs or stars, with a few outliers. We should not expect it to change much.

What we should expect to see, or at least hope to see, is more information about what happens as a result of the user's rating, and transparency about whether they are rating the response to their request or the model as a whole.

Details and variations

  • Generally consists of a thumbs up or thumbs down
  • May also include other questions, such as comparing two different versions or rating the quality of regenerations
  • Some ratings explicitly state that the feedback will be used to improve responses to the user, but in most cases the effect is unclear (a data sketch follows this list)
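
One way to make those distinctions concrete is to record them in the rating event itself. The TypeScript sketch below is illustrative only; every name in it (FeedbackEvent, submitFeedback, the /feedback endpoint) is a hypothetical stand-in, not any product's real API:

```typescript
// Illustrative only: a rating payload that makes explicit what is being
// rated and what the user was told. All names here are hypothetical.

type FeedbackKind =
  | { kind: "thumbs"; value: "up" | "down" }
  | { kind: "comparison"; preferred: "a" | "b" }   // two versions shown side by side
  | { kind: "regeneration"; improved: boolean };   // was the regenerated answer better?

interface FeedbackEvent {
  responseId: string;              // the specific response being rated
  scope: "response" | "model";     // rating this answer, or the model as a whole?
  feedback: FeedbackKind;
  comment?: string;                // optional free-text detail
  disclosedUse: string;            // the copy shown to the user about how feedback is used
}

// Send the event to a (hypothetical) feedback endpoint.
async function submitFeedback(event: FeedbackEvent): Promise<void> {
  await fetch("/feedback", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(event),
  });
}
```

Carrying the disclosure copy with each event leaves a record of exactly what the user was told about how their feedback would be used.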

Considerations

Positives

Improves the overall experience
Engineers and designers get real-time feedback on situations where the model is failing to produce its intended results.

Empowers users
So long as the feedback is used to improve the prompter's experience, this pattern lets users act as contributors to the model's overall strength.

Potential risks

Ethical risk
In some cases, this information is used to determine how well the model is performing at replacing human labor (as opposed to simply tuning the model itself). People could be upset to learn that their input is helping to displace people from jobs. Companies should be upfront about how they use this data.

No immediate value
If no additional affordance is provided to improve the user's experience, companies are collecting user data with no immediate or cathartic value returned in exchange ["if the service is free, you are the product"]. Avoid this by offering suggestions for how to get better results, or by teaching users to improve their results by giving feedback directly to the bot (a sketch of one approach follows).
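
To make "return something to the user" concrete, here is a small sketch that answers a thumbs-down with one actionable tip. The names (onThumbsDown, TIPS) and the canned advice are hypothetical placeholders, not a prescription:

```typescript
// Illustrative only: pair a negative rating with an immediate, useful response.

const TIPS: string[] = [
  "Add more context to your request, such as the audience or format you want.",
  "Ask the bot to explain its answer, then tell it what to change.",
  "Break a large request into smaller steps and rate each one.",
];

function onThumbsDown(responseId: string): string {
  // Record the rating, then hand the user one concrete suggestion
  // so the exchange is not one-sided data collection.
  console.log(`Thumbs-down recorded for response ${responseId}`);
  const tip = TIPS[Math.floor(Math.random() * TIPS.length)];
  return `Thanks for the feedback. One tip for better results: ${tip}`;
}
```

The point is the exchange: the user gives a rating and immediately gets something useful back.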

Use when:
You want to assess how well the AI is performing against the user's expectations.

Examples

  • Jasper
  • Notion
  • Google
  • GitHub
  • ChatGPT
  • Julius (star ratings)
In addition to asking for feedback, consider teaching the user how to get better results from the bot directly.