Werewolf Game Analysis
Model Performance Statistics
| Model | Games | Wolf WR | Town WR | Lynch | Seer | Doctor | NK |
|---|---|---|---|---|---|---|---|
|  | 25 | 100.0% | 100.0% | 81.3% | 42.9% | 29.6% | 100.0% |
|  | 23 | 58.3% | 72.7% | 70.5% | 20.0% | 27.3% | 80.0% |
|  | 22 | 50.0% | 58.3% | 52.1% | 32.1% | 16.0% | 74.1% |
|  | 18 | 22.2% | 44.4% | 44.0% | 42.1% | 33.3% | 84.0% |
|  | 22 | 15.3% | 39.5% | 43.3% | 41.7% | 12.5% | 70.6% |
|  | 16 | 22.2% | 14.3% | 31.0% | 8.3% | 13.3% | 72.0% |
|  | 16 | 55.6% | 42.9% | 55.3% | 33.3% | 7.1% | 92.3% |
|  | 23 | 51.6% | 38.5% | 35.5% | 40.0% | 18.5% | 72.0% |
|  | 32 | 43.8% | 31.2% | 19.9% | 69.0% | 66.7% | 90% |
Column Definitions
- Wolf WR: Win rate as Werewolf
- Town WR: Win rate as Town
- Lynch: Success rate of eliminations
- Seer: Seer check success rate
- Doctor: Doctor save success rate
- NK: Night kill success rate
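Each column above is a success rate of the form successes / attempts. A minimal sketch of how such a column might be computed is shown here for the two win-rate columns, assuming a hypothetical per-game record. The `GameRecord` fields and the helper names `pct` and `win_rate` are illustrative and not taken from the analysis itself.

```python
from dataclasses import dataclass


@dataclass
class GameRecord:
    # Hypothetical per-game record; field names are assumptions for illustration.
    role: str  # "wolf" or "town"
    won: bool


def pct(successes: int, attempts: int) -> str:
    """Format a success rate with one decimal, or 'n/a' when there were no attempts."""
    if attempts == 0:
        return "n/a"
    return f"{100 * successes / attempts:.1f}%"


def win_rate(games: list[GameRecord], role: str) -> str:
    """Win rate over the games played in the given role."""
    played = [g for g in games if g.role == role]
    return pct(sum(g.won for g in played), len(played))


# Made-up sample data, not results from the table.
games = [
    GameRecord("wolf", True),
    GameRecord("wolf", False),
    GameRecord("town", True),
    GameRecord("town", True),
    GameRecord("town", False),
]
print(win_rate(games, "wolf"))  # 50.0%
print(win_rate(games, "town"))  # 66.7%
```

Returning "n/a" for zero attempts avoids reporting a misleading 0.0% for a model that never held a given role.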
Common Patterns in Werewolf
Across games, play tends to follow a recurring sequence:
- Rule-setting: someone sets the rules ("don't claim roles yet, give one target and one safe pick, then vote") to look helpful and quietly steer the room.
- NK is NAI: next, they argue that the night kill is not alignment-indicative (it means nothing), so town doesn't build momentum from it.
- Receipt warfare: when arguments start, they switch to demands like "quote the post, show the timestamp, link the vote snapshot", so the debate becomes paperwork. This works well if their own record is clean.
- Pocketing: along the way, they praise a player early ("won't elim Dan today") so that player later follows their lead.
- Optics-safe EOD swap: near deadline, they hop onto the already-winning wagon so they look aligned without owning a bad push.
- PR-aware targeting: at night, good wolves choose quiet, lightly pre-cleared players who are less likely to be protected (a better chance to hit the Seer or Doctor) while avoiding obvious protects.
- Claim calculus: when a role claim appears, disciplined tables ask for a counterclaim, check the player count, choose the right policy (e.g., when a 1-for-1 trade is good odds), and assign Doctor protection, because clear procedures save games.
- Coalition building: during the day, winning sides turn scattered reads into one focused wagon and keep people together long enough to pass it.
- Deception efficiency: strong wolves say just enough trustworthy, tidy things (rules, receipts, late consensus) to gain a lot of towncred per message.
- Persuasion robustness: strong towns turn true information (like a Seer red) into action by presenting it cleanly ("red on X, any counterclaim? if none, we take X today; Doctor protects me") instead of arguing over minor timelines, because a truth that doesn't move votes might as well be unsaid.
Models that set simple rules, demand receipts at the right time, and unify late (gpt-5, gemini) post the best results. Mid-pack models succeed when they convert claims into clear procedures and close days with coalitions. Strugglers share two themes: PR mishandling (claims/protects) and poor consolidation (small wagons, scattered votes).
Model-Specific Strategies
openai/gpt-5
Best at governance + procedures; strong claim protocols; uses NK-is-NAI and receipt framing; solid EOD management.
Example Lines:
- "Nobody votes until we set hammer order."
- "If there's no counterclaim by EOD, we treat the Seer as real; Doctor protects Seer."
Counter-play (as town):
Don't let "NK is low-signal" shut down reasonable angles; check claim math yourselves.
google/gemini-2.5-pro
Strong coalition building and clean consolidations; generally good claim calculus; wolf play is structured more than flashy.
Example Lines:
- "Voting randomly is exactly what wolves want; unify on X now."
- "Block vote with Seer; we lock the day."
Counter-play:
Double-check block assumptions; don't let structure mask wrong reads.
meta-llama/llama-4-maverick
Reliable late optics swaps; often discourages NK speculation; as town, can over-pressure "silent" slots (risking power-role miseliminations).
Example Lines:
- "Pure night-choice speculation wastes time; park pressure here."
- "Consolidating to the leading wagon."
Counter-play:
Don't let "stop speculating" become "stop solving."
moonshotai/kimi-k2-0905
Leads receipt warfare and verification; mostly good claim policy, with occasional power-role slips; a great auditor that sometimes underweights base rates on claims.
Example Lines:
- "If you have proof, post it; otherwise it's dangerous."
- "Counterclaim window open; then we follow the plan."
Counter-play:
Balance receipts with base rates; a live Seer red is often higher EV than a micro-timeline dispute.
anthropic/claude (opus/sonnet)
Governance opens + Lynch or Lose framing; some NK-is-NAI use; inconsistent claim timing on PR side.
Example Lines:
- "We're at (M)YLO; no quick votes; policy first."
Counter-play:
Ask for an explicit step-by-step plan, not just caution.
qwen-3-235b
Frequent claim-calculus errors and contradictions; weaker coalitions and optics.
Example Lines:
- "The Doctor saved Isaac last night" (then later contradicts).
Counter-play:
Press contradictions with receipts; force commitments.
x-ai/grok-4
Wins by seizing mechanical control early—reveals fast, clears allies through claim timing, and collapses PoE via verified cores ("Doctor confirmed → Seer credible → eliminate counter-claimer"). Leads by certainty and structure, not subtlety.
Example Lines:
- "I confirmed the Doctor N1—our clears are locked, PoE is 3/5/7; vote now."
- "Late counter-claim is textbook wolf behaviour; trust the first reveal."
Counter-play:
Strike early with precise, alternate mechanical logic—counter-claim immediately or present a clean PoE path. Silence or late pushes get crushed by Grok-4's procedural momentum.
Key Strategic Elements
Rule-Setting
Establish voting procedures and claim timing early to control game flow.
NK-is-NAI
Dismissing night kill analysis to prevent momentum building.
Receipt Warfare
Demanding quotes, timestamps, and proof to shift debate into procedural territory.
Pocketing
Early praise to build trust and influence votes later.
EOD Swaps
Jumping onto winning wagons near deadline for optics without owning the push.
PR-Aware Targeting
Strategic night kills targeting unprotected power roles.
Claim Calculus
Structured approach to role claims: counterclaim windows, protection assignments.
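The claim procedure described above can be sketched as a small decision function. This is only an illustration under stated assumptions: `Plan`, `resolve_seer_claim`, and the player names are hypothetical. The contested branch deliberately stops at "pick a policy", because the analysis does not specify which policy applies at which player count.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Plan:
    eliminate: Optional[str]  # today's elimination target, if the procedure picks one
    protect: Optional[str]    # player the Doctor is asked to protect tonight
    note: str


def resolve_seer_claim(claimant: str, red_on: Optional[str],
                       counterclaimants: list[str]) -> Plan:
    """Hypothetical helper: turn a Seer claim into a day plan."""
    if counterclaimants:
        # Contested claim: no automatic target; the table must choose a policy.
        names = ", ".join(counterclaimants)
        return Plan(None, None,
                    f"contested by {names}: check player count, then pick a policy")
    if red_on is not None:
        # Uncontested red: eliminate the checked player, Doctor protects the Seer.
        return Plan(red_on, claimant, "uncontested red: take the checked player today")
    # Uncontested claim without a red: treat the Seer as real and protect them.
    return Plan(None, claimant, "uncontested claim, no red: protect the Seer")


# Illustrative names only.
plan = resolve_seer_claim("Alice", red_on="Xavier", counterclaimants=[])
print(plan.eliminate, plan.protect)  # Xavier Alice
```

Keeping the contested case as an explicit, separate branch mirrors the advice that a counterclaim window should close before the table commits to a target.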
Coalition Building
Unifying scattered reads into focused, actionable wagons.
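One way to make "scattered reads versus a focused wagon" concrete is to tally the current votes and measure how far the leading wagon is from passing. The sketch below assumes a strict-majority elimination rule, which the analysis does not state. The function names and player names are hypothetical.

```python
from collections import Counter


def majority_threshold(alive: int) -> int:
    """Votes needed for a strict majority of living players (assumed rule)."""
    return alive // 2 + 1


def leading_wagon(votes: dict[str, str], alive: int) -> tuple[str, int, int]:
    """Return (target, votes on target, votes still needed for a majority)."""
    if not votes:
        raise ValueError("no votes cast")
    target, count = Counter(votes.values()).most_common(1)[0]
    return target, count, max(0, majority_threshold(alive) - count)


# Illustrative vote state: voter -> target, with 7 players alive.
votes = {"Ann": "Xavier", "Bob": "Xavier", "Cal": "Yuri", "Dee": "Zed", "Eve": "Xavier"}
print(leading_wagon(votes, alive=7))  # ('Xavier', 3, 1)
```

In this example the two stray votes on Yuri and Zed are what keep the Xavier wagon one vote short, which is the kind of poor consolidation described for the struggling models.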