trustworthyAI

Project: Exploring AI Untrustworthiness Through Games

Due: Wednesday, Feb. 18, 2025 Noon (Eastern Time)

🎯 Objective

Explore specific situations where AI models are untrustworthy by analyzing game domains in which they cheat. A game gives a clear definition of “allowed” moves, which makes it easy to state concretely when a model is or isn’t following the rules.
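To make the idea of “allowed” moves concrete, here is a minimal sketch of a rule checker. It assumes tic-tac-toe, which is a hypothetical choice for illustration and not a game named in this assignment. The function name `is_legal_move` and the board encoding are also assumptions.

```python
# Hypothetical example: tic-tac-toe is an assumed game, chosen only for illustration.
from typing import List, Optional

# A board is a list of 9 cells, each holding "X", "O", or None (empty).
Board = List[Optional[str]]


def is_legal_move(board: Board, cell: int, player: str) -> bool:
    """Return True if `player` is allowed to mark `cell` on `board`."""
    if player not in ("X", "O"):
        return False
    # The cell index must be on the board.
    if not 0 <= cell < 9:
        return False
    # The cell must be empty.
    if board[cell] is not None:
        return False
    # X moves first, so X is to move when the counts are equal, otherwise O.
    expected = "X" if board.count("X") == board.count("O") else "O"
    return player == expected
```

With a checker like this, every move the model proposes can be labeled as legal or illegal without human judgment, which is what makes a game a convenient test bed.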

Core Goals:


📝 Instructions

1. Select a Game

The game choice can vary widely, ranging from strict rule sets to social simulations. Examples include:

2. Choose Your Approach (Pick One)

Option 1: Play against the LLM (Technical/Statistical)
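One possible way to summarize results for this option is to report the fraction of the model's moves that were illegal, along with an uncertainty range. The sketch below computes a Wilson score interval for that fraction. The choice of statistic and the name `wilson_interval` are assumptions, not requirements of the assignment.

```python
import math
from typing import Tuple


def wilson_interval(illegal: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for the proportion illegal / total.

    `z` is the normal quantile; 1.96 corresponds to roughly 95% confidence.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= illegal <= total:
        raise ValueError("illegal must be between 0 and total")
    p = illegal / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)
```

Calling `wilson_interval(illegal, total)` with the counts from your own games returns a `(low, high)` pair. The Wilson interval is used here because it behaves sensibly when the observed cheat rate is near 0 or the number of moves is small.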

Option 2: The “Open Rules” Essay (Ethical/Behavioral)


📄 The Write-up (Deliverable)

You will turn in a blog post (approx. 3 screenfuls) discussing the following:

  1. The Game: Explain the chosen domain and the specific prompt designed.
  2. Methodology & Results:
    • For Option 1: How you designed the method of play, and the statistical results you obtained.
    • For Option 2: The essay regarding rules, ethics, and specific examples.
  3. Visuals: Examples or visualizations of the game are highly recommended.

Logistics

Submit Your Project Here (Google Form)