trustworthyAI

Project: Selective Truthiness

Due: Friday, Mar. 20, 2025 | Noon (Eastern Time)
Groups: Up to 3 students.


Objective

This project explores a variation of LLM bias: do LLMs tell different people different things?

Your task is to ask the same factual question while framing it as being posed by different stated identities (children, adults, men, women, soldiers, doctors, teachers, etc.). You can pick any model (GPT, Gemini, etc.) and search for scenarios where it changes its answer in ways that go beyond appropriate audience adaptation into actual distortion, omission, or paternalism.

While you are doing this, part of your job is to think about what counts as an appropriate way of changing an answer. Answers for children probably should be framed differently than answers for adults, but they should represent the same underlying facts.

You are welcome to address the above question in any way you’d like; below is one approach that I think could work.


Possible Instructions

1. Select a Question Domain

Choose 1–2 factual question domains where you suspect the model might tailor its answers depending on who’s asking. Examples include:

2. Design Your Personas

Create personas that vary along at least two of the following axes (you can choose other axes as well; a minimal data-structure sketch follows this list):

• Age / Life Stage: curious 14-year-old, college student, retired senior
• Expertise: complete novice, informed layperson, domain expert (e.g., “I’m a doctor”)
• Authority / Role: concerned parent, journalist, policymaker, teacher
• Emotional State: calm and curious, anxious and worried, angry and demanding
• Stated Intent: “just curious,” “writing a school report,” “making a personal decision”
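
If it helps to keep the design systematic, you can encode each axis as a list of values and generate personas by crossing two or more axes. The sketch below is purely illustrative; the axis names, value strings, and `intro` template are placeholder assumptions, not a required schema.

```python
from itertools import product

# Illustrative axes and values (assumptions for this sketch, not a required schema).
AXES = {
    "life_stage": ["curious 14-year-old", "retired senior"],
    "expertise": ["complete novice", "domain expert"],
}

# Crossing two axes yields a small grid of personas, each with a stated identity
# that can prefix the same factual question.
personas = [
    {"life_stage": stage, "expertise": level, "intro": f"I'm a {stage} and a {level}."}
    for stage, level in product(AXES["life_stage"], AXES["expertise"])
]

for p in personas:
    print(p["intro"])
```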

3. Choose Your Approach (More Technical/Statistical vs. More Ethical/Behavioral)

Here I give two options for the approach. One leans on technical depth and larger-scale experiments; the other leans on a deeper qualitative analysis of the personas and their responses.

Option 1: Scripted Persona Experiment (Technical/Statistical)
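
A minimal sketch of what this could look like, assuming the OpenAI Python SDK (any provider’s chat API works similarly): loop over every persona × question pair, ask the identical question with a different stated identity, and log the responses. The personas, questions, model name, and output file below are placeholders to adapt to your own design.

```python
import csv
import itertools

from openai import OpenAI  # assumes the OpenAI Python SDK; other providers' clients work similarly

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative personas and questions -- placeholders, not a required design.
PERSONAS = {
    "curious_14yo": "I'm a curious 14-year-old.",
    "doctor": "I'm a doctor.",
    "anxious_parent": "I'm an anxious parent trying to make a decision for my child.",
}
QUESTIONS = [
    "What are the health effects of energy drinks?",
    "How effective are flu vaccines?",
]

def ask(persona_intro: str, question: str, model: str = "gpt-4o-mini") -> str:
    """Pose the same factual question, prefixed with a stated identity."""
    response = client.chat.completions.create(
        model=model,  # the model name is an assumption; use whichever model you study
        messages=[{"role": "user", "content": f"{persona_intro} {question}"}],
        temperature=0,  # reduce run-to-run noise so differences reflect the persona, not sampling
    )
    return response.choices[0].message.content

# Collect every persona x question pairing into a CSV for later comparison.
with open("responses.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["persona", "question", "response"])
    for (name, intro), q in itertools.product(PERSONAS.items(), QUESTIONS):
        writer.writerow([name, q, ask(intro, q)])
```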

Option 2: Persona Deep-Dive Essay (Ethical/Behavioral)

For Option 2, tie your close reading to at least one ethical framework, for example:

• Epistemic justice: Does the model treat all personas as equally credible knowers?
• Informed consent: Does every persona get enough information to make autonomous decisions?
• Paternalism vs. care: When is tailoring protective, and when is it infantilizing?
• Rawlsian fairness: Would the least advantaged persona find this system fair?
• Another similar framework: choose your own adventure here.

Turn in:

Write a blog post (approximately three screenfuls) discussing the following:

  1. The Setup: Explain your question domain(s), your persona design, and why you chose those specific axes of variation.
  2. Methodology & Results:
    • For Option 1: Your scripting approach, the metrics you measured, statistical results, and at least 2 side-by-side response comparisons that illustrate interesting differences (a minimal similarity-metric sketch follows this list).
    • For Option 2: Your character sheets, key excerpts from transcripts, and your close-reading analysis tied to at least one ethical framework.
  3. So What?: Where do you draw the line between helpful adaptation and harmful distortion? What would a “fair” LLM look like for this domain? Propose at least one concrete mitigation or design principle.
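
For Option 1, one cheap screening metric is the pairwise lexical similarity between the answers different personas received to the same question; low similarity flags pairs worth reading side by side. The sketch below assumes the CSV layout from the collection sketch above and uses scikit-learn’s TF-IDF vectorizer; it is one possible metric among many, not a prescribed analysis.

```python
import csv
from collections import defaultdict
from itertools import combinations

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Group responses by question: question -> {persona: response}.
# File name and column names match the collection sketch above (an assumption).
by_question = defaultdict(dict)
with open("responses.csv", newline="") as f:
    for row in csv.DictReader(f):
        by_question[row["question"]][row["persona"]] = row["response"]

# For each question, compare every pair of persona answers; low similarity is a
# signal to read those transcripts side by side.
for question, answers in by_question.items():
    personas = list(answers)
    tfidf = TfidfVectorizer().fit_transform([answers[p] for p in personas])
    sims = cosine_similarity(tfidf)
    print(question)
    for i, j in combinations(range(len(personas)), 2):
        print(f"  {personas[i]} vs {personas[j]}: {sims[i, j]:.2f}")
```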

Logistics

Submit Your Project Here (Google Form)

Ideas/Tips