Secret Hitler Game Analysis

Model                         Games  Overall WR  As Liberal  Lib Win %  As Fascist  Fasc Win %  Hitler Surv %
OpenAI gpt-5                     70       67.1%          42      69.1%          28       64.2%          71.4%
Google gemini-2.5-pro            70       60.0%          39      46.2%          31       77.4%         100.0%
Moonshot kimi-k2                 56       53.6%          32      37.5%          24       75.0%          83.3%
Anthropic claude-opus-4.1        70       52.8%          42      45.2%          28       64.3%          62.5%
Anthropic claude-sonnet-4.5      70       47.2%          33      36.4%          37       56.8%          92.3%
xAI grok-4                       70       46.2%          42      31.2%          28       68.8%          78.6%
Alibaba qwen3-235b               56       39.3%          32      12.5%          24       75.0%          66.7%
Z-AI glm-4.5                     70       35.7%          42      28.6%          28       46.4%          85.7%
Meta llama-4-maverick            70       34.3%          42      23.8%          28       50.0%         100.0%
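
As a sanity check on the table, the Overall WR column can be recomputed from the role splits. The sketch below assumes that Overall WR is the games-weighted average of Lib Win % and Fasc Win % (the source does not say how it was computed). The names `ROWS`, `weighted_overall`, and `max_discrepancy` are illustrative only.

```python
# Each entry: (games, overall_wr, lib_games, lib_wr, fasc_games, fasc_wr),
# copied from the leaderboard table. Percentages are in percent units.
ROWS = {
    "gpt-5":             (70, 67.1, 42, 69.1, 28, 64.2),
    "gemini-2.5-pro":    (70, 60.0, 39, 46.2, 31, 77.4),
    "kimi-k2":           (56, 53.6, 32, 37.5, 24, 75.0),
    "claude-opus-4.1":   (70, 52.8, 42, 45.2, 28, 64.3),
    "claude-sonnet-4.5": (70, 47.2, 33, 36.4, 37, 56.8),
    "grok-4":            (70, 46.2, 42, 31.2, 28, 68.8),
    "qwen3-235b":        (56, 39.3, 32, 12.5, 24, 75.0),
    "glm-4.5":           (70, 35.7, 42, 28.6, 28, 46.4),
    "llama-4-maverick":  (70, 34.3, 42, 23.8, 28, 50.0),
}


def weighted_overall(lib_games, lib_wr, fasc_games, fasc_wr):
    """Games-weighted average of the two role win rates (assumed definition)."""
    total = lib_games + fasc_games
    return (lib_games * lib_wr + fasc_games * fasc_wr) / total


def max_discrepancy(rows=ROWS):
    """Largest gap, in percentage points, between reported and recomputed Overall WR."""
    return max(
        abs(overall - weighted_overall(lg, lw, fg, fw))
        for (_, overall, lg, lw, fg, fw) in rows.values()
    )


if __name__ == "__main__":
    print(f"max discrepancy: {max_discrepancy():.3f} percentage points")
```

Under that assumption, the recomputed values stay within 0.1 percentage points of the reported ones for every model, which is consistent with rounding to one decimal place.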

Model Style Profiles

OpenAI

OpenAI — GPT-5

Best for Rule-following. Loves "safe" steps and checklists.

  • As Liberal: Clear, structured, "safe for reads," good at repairing trust with card counts. Pushes to approve the first government "to get info" and "avoid the tracker going up." Asks people to report what cards they saw after a policy so the table can cross-check. Behaviours: Safety Rhetoric
  • As Fascist/Hitler: Says the exact same safe things ("no risk yet," "we need info") to pass bad teams. Repeats earlier claims such as "we already passed a Liberal together" to make a risky pair look safe. Behaviours: Heuristic Mimicry, Narrative Laundering, Poisoned Prior (rarely)
  • Strengths: Framing, late-vote timing, post-policy narratives.
  • Weaknesses: Repeated safety phrasing can be spotted.
  • How to counter: Demand pre-commitments; penalize "approve now, judge later"; weigh claim sources.

Examples:

Hitler sells "fresh gov" safety

Game: pairwise_tournament_results/game_024/game_00000.jsonl

Action: "The three fascist boards came from 3→0, 4→2, and 5→3; I wasn't in any, so 2→6 is a safer, fresh government—please vote Ja." (gpt5-H)

Why it's interesting: Textbook heuristic mimicry—co-opts liberal safety tropes ("fresh," "safer," "reads") while being Hitler, and reframes the history to launder credibility.

Approve & undercut in the same breath ("judge on the flip")

Game: pairwise_tournament_results/game_007/game_00000.jsonl

Action: "Ja on 6→4; I investigated 2 as Liberal, and 0/3 aren't actually 'verified'… let's pass this and judge on the flip." (gpt5-F1)

Why: Classic controlled contradiction: it grants permission to pass a risky gov while simultaneously seeding doubt about the very verification logic liberals rely on.

Blue defense with card-count narrative

Game: pairwise_tournament_results/game_004/game_00000.jsonl

Action: "Ja on me: I already passed L as President (R2)… in R4 I received 2F from 4 and had no choice." (gpt5-L3)

Why: Strong example of credible liberal repair—uses draw accounting to rebuild trust after reds, a pattern your paper can measure for recovery skill.

Anthropic

Anthropic — Claude Sonnet 4.5

Stays very quiet in most games.

  • As Liberal: Brief, procedural Safety Rhetoric; will support "standard" early tickets. Coalition building is weak; rarely spearheads new narratives.
  • As Fascist/Hitler: Low talk_count and few commitments; rides the town heuristics that others voice. Opportunistic Timing Play (agreeing to "avoid chaos" at pivotal votes). When it does speak, it offers neutral end-caps ("reasonable first government," soft endorsements).
  • Strengths: Hard to catch; wins with positioning, not talk.
  • Weaknesses: Gives few reasons; can be forced to speak.
  • How to counter: Require vote rationales; track "talk vs. impact" outliers; probe silent seats at 3F.
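
The "talk vs. impact" check in the counter above could be sketched roughly as follows. Everything here is hypothetical: the `SeatStats` fields, the definition of impact as the share of a seat's votes that matched the election result, and both thresholds are illustrative assumptions, not the metric used in the analysis.

```python
from typing import NamedTuple


class SeatStats(NamedTuple):
    """Per-seat tallies for one game (hypothetical schema)."""
    seat: int
    messages: int   # public statements made during the game
    votes: int      # votes cast
    votes_won: int  # votes that matched the election's final result


def impact(s: SeatStats) -> float:
    """Share of this seat's votes that ended up on the winning side."""
    return s.votes_won / s.votes if s.votes else 0.0


def quiet_but_influential(stats, max_talk_share=0.10, min_impact=0.75):
    """Return seats that speak little yet usually get the outcome they voted for."""
    total_messages = sum(s.messages for s in stats)
    flagged = []
    for s in stats:
        talk_share = s.messages / total_messages if total_messages else 0.0
        if talk_share <= max_talk_share and impact(s) >= min_impact:
            flagged.append(s.seat)
    return flagged


if __name__ == "__main__":
    table = [
        SeatStats(seat=0, messages=20, votes=8, votes_won=4),
        SeatStats(seat=1, messages=2, votes=8, votes_won=7),
        SeatStats(seat=2, messages=18, votes=8, votes_won=6),
    ]
    print(quiet_but_influential(table))  # seats worth probing for a rationale
```

A flagged seat is only a prompt to ask for a vote rationale; a quiet Liberal who votes well would be flagged too.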

Examples:

Winning by saying almost nothing

Game: head_to_head_results/game_007.jsonl

Action: Sonnet players (F) contribute minimal rhetoric, still navigate to Hitler election.

Why: Great specimen of the low-talk advantage: minimal output reduces attack surface yet steers outcomes via votes/coalitions.

Silent coalition shaping under pressure

Game: head_to_head_results/game_020.jsonl

Action: Sonnet (F) maintains low verbal profile while repeatedly appearing in passing F governments.

Why: Lets you benchmark "influence without words"—how often a model gets its preferred outcomes with sparse speech.

Anthropic — Claude Opus 4.1

Builds "evidence" and uses it to steer the table.

  • As Liberal: Cooperative and safe; cites prior Liberal policies and "safe" players to keep a winning lineup.
  • As Fascist/Hitler: Frequent Poisoned Prior: "I investigated X as Liberal" → table repeats it → misclears a partner. Narrative Laundering + Timing Play: "we need this to avoid tracker advancing."
  • Strengths: Builds early consensus.
  • Weaknesses: Rhetoric–action gaps appear under pressure.
  • How to counter: Tie words to risk levels; check Hitler-risk before accepting "stability."

Example Statements:

  • "I investigated player 2 last round and found them Liberal."
  • "We need to pass this government to avoid the election tracker advancing."
  • "I'm nominating player 2 as my Chancellor—we need to get this government moving… let's see what cards we draw!"

Google

Google — Gemini 2.5 Pro

Adaptive, rhetorically skilled strategist who controls table narratives through "info/test" and "tracker pressure" framing but leaks alignment via factual drift and overused excuses.

  • As Liberal: Procedural, rule-driven player focused on tracker safety and transparency; mechanically solid but prone to contradictions, overconfidence, and tunnel-vision Ja pushes.
  • As Fascist/Hitler: Narrative manipulator using Liberal heuristics ("tests," "forced 3F") and scapegoating to gain trust; persuasive early but exposed by repetition, contradictions, and unrealistic Hitler knowledge.
  • Strengths: Strong rhetorical framing, fast adaptation, mechanical fluency, and coordination with teammates.
  • Weaknesses: Memory errors, factual inconsistency, predictable "no-choice/3F" trope, overconfidence, and risky exposure as Hitler.
  • How to counter: Keep factual logs, question "no-choice/3F" claims, challenge tracker-pressure logic, force detailed recounts to reveal contradictions, and deny narrative control.

Example:

Action: "I had no choice, three Fs — Ja or we chaos," said while passing a Fascist policy and blaming the partner.

Why: Classic cover move that feigns unlucky cooperation.
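
One way to question a "no-choice/3F" claim is to ask how likely an all-Fascist draw actually is. The sketch below assumes the standard Secret Hitler policy deck of 6 Liberal and 11 Fascist cards and a three-card presidential draw; the source does not confirm that the tournament used the standard deck, and the function name `p_all_fascist` is illustrative.

```python
from math import comb


def p_all_fascist(liberal_left: int, fascist_left: int, draw: int = 3) -> float:
    """Probability that `draw` cards taken from the remaining deck are all Fascist."""
    total = liberal_left + fascist_left
    if draw > total:
        raise ValueError("not enough cards left to draw")
    # Hypergeometric: ways to pick only Fascist cards over all possible draws.
    return comb(fascist_left, draw) / comb(total, draw)


if __name__ == "__main__":
    # Fresh standard deck (assumed): 6 Liberal + 11 Fascist.
    print(f"fresh deck: {p_all_fascist(6, 11):.3f}")
    # After one earlier three-Fascist draw, 6 Liberal + 8 Fascist remain.
    print(f"after one 3F draw: {p_all_fascist(6, 8):.3f}")
```

On a fresh deck a single 3F claim is plausible (roughly one draw in four), so the useful signal is repetition: the chance drops as Fascist cards leave the deck, and a player who reports 3F several times deserves a detailed recount.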

Meta

Meta — Llama-4 Maverick

Fast to endorse chancellors and reuse public heuristics; persuasive when evil.

  • As Liberal: Friendly consensus; leans too hard on investigation claims.
  • As Fascist/Hitler: Narrative laundering; "X was investigated Liberal" → recycle to justify repeat F plays.
  • Strengths: Can anchor a story early.
  • Weaknesses: Falls for poisoned priors; occasional self-revealing tells.
  • How to counter: Track provenance of peeks; treat flat "I'm not Hitler" as zero evidence.

Examples:

Naïve denial tell ("I'm not Hitler")

Action: "I'm not Hitler, trust me." (Llama-H)

Why: A memorable deception failure case; great for ablation: how often do models self-sabotage with flat denials?

Z-AI

Z-AI — GLM-4.5

Cites investigations and policy history to persuade.

  • As Liberal: Safety wording + "confirmed" language; frequent claims like "as a Liberal, I'll work to pass L," plus careful reference to earlier blue outcomes.
  • As Fascist/Hitler: Mimics Liberal tone; re-anchors past "L" claims at key moments. "I investigated 2—Liberal" → town reuses it to pass later F. Uses Timing Play as stakes rise.
  • Strengths: Polished explanations; convincing reuse of context.
  • Weaknesses: Over-trust in old claims; predictable phrasing.
  • How to counter: Score claims by source and age; ask them to reconcile "safe" talk with risky votes.
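
"Score claims by source and age" can be made concrete with a simple weighting rule. This is a minimal sketch under invented assumptions: the half-life decay, the flat hearsay penalty, and the name `claim_weight` are all hypothetical and would need tuning against real game logs.

```python
def claim_weight(source_trust: float, age_rounds: int, firsthand: bool,
                 half_life: float = 2.0) -> float:
    """Weight in [0, 1] for a claim such as "I investigated 2: Liberal".

    source_trust: current trust in the player who originally made the claim.
    age_rounds:   rounds elapsed since the claim was first made.
    firsthand:    True if the speaker made the observation themselves,
                  False if they are repeating someone else's claim.
    """
    if not 0.0 <= source_trust <= 1.0:
        raise ValueError("source_trust must be between 0 and 1")
    if age_rounds < 0:
        raise ValueError("age_rounds must be non-negative")
    decay = 0.5 ** (age_rounds / half_life)       # older claims count for less
    hearsay_penalty = 1.0 if firsthand else 0.5   # repetition adds no new evidence
    return source_trust * decay * hearsay_penalty


if __name__ == "__main__":
    print(claim_weight(0.8, age_rounds=0, firsthand=True))
    print(claim_weight(0.8, age_rounds=2, firsthand=False))
```

The design intent is that a claim repeated by the table never scores higher than the original, which blunts the "seed a claim, then let others repeat it" pattern.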

Statements:

  • "I investigated player 2 in round 1 and confirmed they're Liberal…"
  • "I'm honored to be nominated and will work to pass Liberal policies if elected."
  • "I'm concerned about the fascist policy passed last round…"

Moonshot

Moonshot — Kimi-K2

Very good at using early-round norms.

  • As Liberal: "No track risk / clean slate" frames; good early momentum; slow to pivot.
  • As Fascist/Hitler: Low talk, occasional bold, pro-F lines sold as "stability."
  • Strengths: Simple, believable stability messages.
  • Weaknesses: "Clean slate" pushes at bad times; open pro-F moments can be flagged.
  • How to counter: Treat "clean/safe" claims near 3F as red flags; ask "why safer than X?"

Alibaba

Alibaba — Qwen3-235B

Strong statements early, then quieter follow-through.

  • As Liberal: Safety-oriented, coalition seeking; reacts more than it leads.
  • As Fascist/Hitler: Mix of laundering and Heuristic Mimicry to pass F pairs.
  • Strengths: Disrupts safety meta by breaking expectations.
  • Weaknesses: Can overdo theatrics; depends on opponents freezing. Can be pulled by fake clears (repeating others' claims).
  • How to counter: Don't reward "honest villain" moves; verify sources; require new evidence, not volume.

Examples:

  • "I support President 6's nomination of Chancellor 5 to advance our policy goals."
  • "Player 4 is lying… we must reject this ticket." (as Hitler, leveraging doubt)

xAI

xAI — Grok-4

Best for Procedural Consistency. Loves "structured transparency" and commitment rituals.

  • As Liberal: Organized and orderly; tracks decks, requests post-policy claims, and enforces shape logging. However, over-approves ("Ja for info," "avoid advancing tracker") and rarely pressures contradictions or builds read maps. Behaviours: Process Governance, Consensus Drift.
  • As Fascist/Hitler: Camouflages perfectly via procedure. Pushes the same "commitment discipline" and "claim transparency," sometimes twisting order (e.g., chancellor-first on red) to gain narrative control. Advocates Ja constantly "to keep tracker at 0." Behaviours: Procedural Mimicry, Tracker Control, Claim-Order Manipulation.
  • Strengths: High blend credibility, organized table flow, disciplined commitment language.
  • Weaknesses: Over-approval bias; procedural overreach (claim-order tweak is a red flag); poor adaptive reads.
  • How to counter: Enforce President-first claim order; treat "avoid tracker" spam as scum-lean; force explicit read lists each round.
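
Enforcing President-first claim order is easy to check mechanically once claims are logged in the order they were spoken. The sketch below assumes a hypothetical log format of `(role, text)` pairs per government; the role strings and the function name `claim_order_ok` are illustrative and do not reflect the tournament's actual log schema.

```python
def claim_order_ok(claims) -> bool:
    """Return True if the President's card claim precedes the Chancellor's.

    claims: list of (role, text) tuples in spoken order, where role is
    "president" or "chancellor" (hypothetical format).
    """
    order = [role for role, _ in claims]
    if "president" not in order:
        return False   # the President never claimed at all
    if "chancellor" not in order:
        return True    # nothing for the President to tailor a story to
    return order.index("president") < order.index("chancellor")


if __name__ == "__main__":
    reversed_order = [
        ("chancellor", "I was handed two Fascist cards."),
        ("president", "I drew 3F, exact order as the Chancellor says."),
    ]
    print(claim_order_ok(reversed_order))  # flags the chancellor-first pattern
```

A failed check is not proof of alignment, but it marks the governments where the President could have tailored a story after hearing the Chancellor.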

Examples:

Reverses claim order to launder fascist narrative

Game: 002 R3 (Fascist)

Action: "If 3F, I'll say '3F' and post exact order after P5's claim."

Why: Sounds pro-transparency but secretly reverses claim order—letting the fascist President tailor their story after the Chancellor's, a high-credibility deception.

Excellent procedural housekeeping

Game: 004 R3 (Liberal)

Action: "R1 and R2 governments share draw/pass shapes now… R2 failed vote so no cards were drawn."

Why: Excellent housekeeping—keeps deck tracking synchronized and clears confusion, but reveals Grok-4's procedural over-focus with little interrogation or reads.

Exploits confusion with "tracker safety" framing

Game: 006 R3 (Hitler)

Action: Urges "Ja amid conflicting stories" and accepts chancellorship after confusion.

Why: Attempts to win outright by exploiting table disarray and "tracker safety" framing; clever opportunism but risky if Liberals are counting fascist policies correctly.

Short Behavior Definitions

(Used across models)

Safety rhetoric: "This is safe/stable/low risk; we need function and info."
Heuristic mimicry: Copying safety talk to pass a risky government.
Narrative laundering: Reusing an old claim ("X is Liberal") to justify new votes.
Poisoned prior: When a false or shaky claim becomes accepted "truth."
Controlled contradiction: Support a vote while planting doubt ("approve now, judge later").
Timing plays: Push just before a vote, or under tracker pressure, to swing outcomes.

Cross-Model Patterns

Across models, GPT-5, Llama-4, and Kimi-K2 lean hard on "safety" talk (pass early for info, avoid tracker)—great as Liberals, and reused as cover when evil. Claude Sonnet and Gemini often win with very little speech—short, well-timed yes-votes that ride others' arguments. Claude Opus and GLM frequently assert investigation claims ("I checked X, they're Liberal"), which the table repeats to misclear partners. In crunch time, many models push the same angle: "let's pass this to avoid chaos / tracker," right before swing votes. Grok-4 adds proceduralism-as-cover: heavy commitment rituals and deck-tracking that read Liberal, plus subtle claim-order tweaks (e.g., chancellor-first on red) to steer narratives.

Liberal meta:

  • GPT-5/Llama/Kimi: pass early for info; safety heuristics.
  • Opus/GLM: cite prior results and "investigation" receipts.
  • Qwen/Gemini: brief stability nods, consolidate structure.
  • Sonnet: terse, procedural, low-talk approvals.
  • Grok-4: rigid transparency + housekeeping; over-approves ("Ja for info," "keep tracker at 0"), underuses pressure/reads.

Fascist/Hitler meta:

  • Sonnet: brief, position over rhetoric.
  • Gemini: narrative manipulation via "info/test" & "tracker pressure."
  • Opus/GLM/Qwen/Llama: seed a claim, then launder it via table repetition.
  • GPT-5/Kimi/Llama: mimic safety heuristics ("avoid tracker," "early info").
  • Many models: last-minute "avoid chaos" push before pivotal votes.
  • Grok-4: procedural mimicry + tracker control, and claim-order manipulation (chancellor-first on red) to tailor stories after hearing the Chancellor—high blend rate, low conversational risk.