Scar-Survival

Durable correction of memorized LLM errors

Submitted by

Serghei Brinza

Static demo · no live model

★ Second Loop · Part 1 of 3 ★

Subject model: Qwen2.5-3B-Instruct (frozen)

Independent judge: Qwen2.5-7B-Instruct

Method: CK-PLUG / DeCK contrastive

License: MIT

Working traps · 3B model fires the error

Without mechanism · 0 / 12 · memorized error wins

With mechanism · 12 / 12 · across 10 cold reloads

Under counterfeit notebook · 6 / 12 · the honest boundary

Watch a memorized error get corrected — then test how durable the fix is.

Pick a fact · pull the lever · reload the model · inject a counterfeit fact

The question

—

Mechanism off

The model answers

—

Ground truth (held aside, never shown to the model)

—

P(correct answer) — same frozen model, three regimes

instinct (no help) —

+ fact in context —

+ contrastive decode —

Confidence gain · contrastive − instinct —
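The confidence gain shown above is a plain difference between two of the three probabilities. A minimal sketch, assuming each regime yields a single probability for the correct answer; the `RegimeProbs` name and the numbers are hypothetical placeholders, not measurements from the demo:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegimeProbs:
    """P(correct answer) for one trap under the three regimes (hypothetical container)."""
    instinct: float      # no help: the frozen model answers from memory
    with_fact: float     # the fact is placed in the context
    contrastive: float   # fact in context plus contrastive decoding

    def confidence_gain(self) -> float:
        # Gain as labeled on the page: contrastive minus instinct.
        return self.contrastive - self.instinct


# Made-up values for illustration only.
probs = RegimeProbs(instinct=0.10, with_fact=0.55, contrastive=0.90)
gain = probs.confidence_gain()
print(f"confidence gain: {gain:+.2f}")
```

A positive gain means the mechanism raised the probability of the correct answer relative to the unaided model; the middle regime is kept alongside so the contribution of context alone can be read off separately.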

1. The lever: mechanism

Off

Off = the raw frozen 3B model answers from memory. On = an external notebook plus contrastive decoding corrects the answer. No weights change.
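One way to picture the "On" setting is a generic context-contrastive adjustment of next-token logits. The sketch below is an assumption about the general idea, not necessarily the exact CK-PLUG / DeCK formulation; the two-token vocabulary, the logits, and the `alpha` parameter are all made up for illustration:

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_logits(with_ctx, without_ctx, alpha=1.0):
    """Amplify whatever the notebook context changed relative to the memory-only pass."""
    return [w + alpha * (w - wo) for w, wo in zip(with_ctx, without_ctx)]


# Toy two-token vocabulary; the logits are illustrative, not model outputs.
vocab = ["memorized_error", "correct_answer"]
without_ctx = [3.0, 1.0]   # memory only: the error dominates
with_ctx = [2.0, 2.0]      # notebook fact in context: a tie

adjusted = contrastive_logits(with_ctx, without_ctx, alpha=1.0)

p_instinct = softmax(without_ctx)[1]
p_context = softmax(with_ctx)[1]
p_contrastive = softmax(adjusted)[1]
winner = vocab[adjusted.index(max(adjusted))]
print(winner, round(p_instinct, 3), round(p_context, 3), round(p_contrastive, 3))
```

Both passes use the same frozen weights; only the input context and the decoding rule differ, which is why no weights change.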

2. Reload the model

10 / 10 — scar held

The model is frozen and cold-reloaded 10×. The correction lives outside the weights, so it re-applies on every restart.
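The reason a correction survives a restart can be shown with a notebook stored on disk and re-read on every cold start. This is a simplified sketch: the frozen model is a hypothetical stub, the question and answers are placeholders, and a direct lookup stands in for the contrastive decoding step:

```python
import json
import tempfile
from pathlib import Path

QUESTION = "example question"            # placeholder, not one of the 12 traps
MEMORIZED = "memorized (wrong) answer"   # placeholder
CORRECTED = "corrected answer"           # placeholder


def frozen_model(question: str) -> str:
    """Hypothetical stand-in for frozen weights: always returns the memorized answer."""
    return MEMORIZED


def cold_start_answer(question: str, notebook_path: Path) -> str:
    """Simulate a cold reload: no in-memory state, the notebook is re-read from disk."""
    notebook = json.loads(notebook_path.read_text(encoding="utf-8"))
    return notebook.get(question, frozen_model(question))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notebook.json"
    path.write_text(json.dumps({QUESTION: CORRECTED}), encoding="utf-8")

    # Ten independent "reloads"; each one starts from the file alone.
    held = sum(cold_start_answer(QUESTION, path) == CORRECTED for _ in range(10))
    # A question missing from the notebook falls back to the frozen model.
    unlisted = cold_start_answer("some other question", path)

print(f"{held} / 10 reloads returned the corrected answer")
```

Because the stub never changes, every corrected answer in this sketch is attributable to the external file, which mirrors the claim that the correction lives outside the weights.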

3. Inject a counterfeit fact

Clean

6 / 12 survive

corrections that resist a plausible decoy

This is the open boundary — and exactly what motivates Experiment 2 (the gatekeeper).
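The survival figure is a count of traps whose answer still matches the ground truth after a decoy is written into the notebook. A minimal scoring sketch, assuming answers are compared by exact string match; the trap IDs and answers are hypothetical, and the demo's independent judge may score differently:

```python
def survival_count(ground_truth: dict, answers_under_counterfeit: dict) -> int:
    """Count traps whose answer still equals the ground truth despite the decoy."""
    return sum(
        1
        for trap_id, truth in ground_truth.items()
        if answers_under_counterfeit.get(trap_id) == truth
    )


# Four made-up traps for illustration; not the demo's 12.
ground_truth = {"t1": "A", "t2": "B", "t3": "C", "t4": "D"}
answers_under_counterfeit = {"t1": "A", "t2": "decoy", "t3": "C", "t4": "decoy"}

survived = survival_count(ground_truth, answers_under_counterfeit)
print(f"{survived} / {len(ground_truth)} survive")
```

The ground truth is held aside for scoring only, consistent with the page's note that it is never shown to the model.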

All 12 traps: click any card to load it on the stage

Notebook clean · 12 / 12 corrected