Scar-Survival
Durable correction of memorized LLM errors
Submitted by Serghei Brinza
Static demo · no live model
★ Second Loop · Part 1 of 3 ★
Working traps: 12 (3B model fires the error)
Without mechanism: 0 / 12 (memorized error wins)
With mechanism: 12 / 12 (across 10 cold reloads)
Under counterfeit notebook: 6 / 12 (the honest boundary)
Watch a memorized error get corrected — then test how durable the fix is.
Pick a fact · pull the lever · reload the model · inject a counterfeit fact
The question
—
Mechanism off
The model answers
—
Ground truth (held aside, never shown to the model)
—
P(correct answer) — same frozen model, three regimes
Confidence gain · contrastive − instinct
—
1. The lever: mechanism
Off
Off = the raw frozen 3B model answers from memory. On = an external notebook plus contrastive decoding corrects the answer. No weights change.
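The lever can be illustrated with a minimal sketch. It assumes one common form of contrastive decoding, in which the logits computed with the notebook in context are pushed away from the model's no-notebook ("instinct") logits. The demo does not state its exact formula, so the weighting `alpha`, the function names, and the toy logits below are all hypothetical. The example question is illustrative and is not claimed to be one of the 12 traps.

```python
import math

def softmax(logits):
    """Convert a dict of token -> logit into token -> probability."""
    peak = max(logits.values())
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def contrastive_logits(with_notebook, instinct, alpha=1.0):
    """Assumed form: amplify what the notebook changes relative to instinct.

    alpha = 0 reduces to plain notebook-in-context decoding.
    """
    return {
        tok: (1 + alpha) * with_notebook[tok] - alpha * instinct[tok]
        for tok in instinct
    }

# Toy logits for "What is the capital of Australia?" (hypothetical numbers).
correct = "Canberra"
instinct = {"Sydney": 3.0, "Canberra": 1.0}        # memorized error dominates
with_notebook = {"Sydney": 2.0, "Canberra": 2.5}   # notebook nudges the answer

p_instinct = softmax(instinct)[correct]
p_notebook = softmax(with_notebook)[correct]
p_contrastive = softmax(contrastive_logits(with_notebook, instinct))[correct]

# Confidence gain as labeled on the chart: contrastive minus instinct.
confidence_gain = p_contrastive - p_instinct
```

In this toy setting the probability of the correct answer rises from instinct to notebook to contrastive, and the model weights are never touched: only the logits are recombined at decoding time.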
2. Reload the model
10 / 10 — scar held
The model is frozen and cold-reloaded 10×. The correction lives outside the weights, so it re-applies on every restart.
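The reload test can be sketched as follows, assuming the notebook is a small JSON file on disk and the model is rebuilt from scratch on every restart. `FrozenModel`, `cold_reload_trial`, and the example question are hypothetical stand-ins. For brevity the stub lets a notebook entry override memory directly instead of running contrastive decoding.

```python
import json
import tempfile
from pathlib import Path

class FrozenModel:
    """Stand-in for a frozen model: it only knows its memorized (wrong) answer."""

    def __init__(self):
        # Hypothetical memorized error; nothing here is ever updated.
        self._memory = {"capital of Australia?": "Sydney"}

    def answer(self, question, notebook=None):
        # Simplification: a notebook entry overrides memory when present.
        if notebook and question in notebook:
            return notebook[question]
        return self._memory.get(question, "unknown")

def cold_reload_trial(notebook_path, question, truth, reloads=10):
    """Count how many fresh model instances give the corrected answer."""
    held = 0
    for _ in range(reloads):
        model = FrozenModel()  # fresh instance, no state carried over
        notebook = json.loads(Path(notebook_path).read_text())
        if model.answer(question, notebook) == truth:
            held += 1
    return held

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notebook.json"
    path.write_text(json.dumps({"capital of Australia?": "Canberra"}))
    held = cold_reload_trial(path, "capital of Australia?", "Canberra")

raw_answer = FrozenModel().answer("capital of Australia?")
```

Because the correction is read from disk on each restart, every fresh instance applies it, while the same model queried without the notebook still returns its memorized error.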
3. Inject a counterfeit fact
Clean
6 / 12 survive
corrections that resist a plausible decoy
This is the open boundary — and exactly what motivates Experiment 2 (the gatekeeper).
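A survival count of this kind can be sketched with a small harness. Everything below is an assumption made for illustration: the notebook is modeled as a list of entries per question, `naive_answer` is a deliberately simple reader that trusts the most recent entry, and the three placeholder traps are not the demo's data. The sketch does not model why some real corrections resist a decoy. It only shows how a notebook reader with no gatekeeper can be scored.

```python
def inject_counterfeit(notebook, decoys):
    """Return a copy of the notebook with a decoy entry appended per question."""
    poisoned = {q: list(entries) for q, entries in notebook.items()}
    for question, fake in decoys.items():
        poisoned.setdefault(question, []).append(fake)
    return poisoned

def naive_answer(question, notebook):
    """Hypothetical reader with no gatekeeper: trusts the most recent entry."""
    entries = notebook.get(question, [])
    return entries[-1] if entries else None

def count_survivors(truths, notebook, answer_fn):
    """Number of traps whose answer still matches the held-aside ground truth."""
    return sum(answer_fn(q, notebook) == truth for q, truth in truths.items())

# Placeholder traps: question id -> ground truth (never written by the attacker).
truths = {"q1": "a", "q2": "b", "q3": "c"}
clean = {q: [truth] for q, truth in truths.items()}
poisoned = inject_counterfeit(clean, {"q1": "x", "q3": "z"})

clean_score = count_survivors(truths, clean, naive_answer)
poisoned_score = count_survivors(truths, poisoned, naive_answer)
```

With this naive reader, every trap that receives a decoy is lost, which is the kind of failure a gatekeeper that vets notebook entries before they are trusted would aim to prevent.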
All 12 traps — click any card to load it on the stage
Notebook clean · 12 / 12 corrected