The compiler works — but it still carries a small, countable set of assumptions about the site it’s compiling, and each one is why a never-seen site can surprise us. This plan makes it a pure replay of what the browser observed, so it generalizes to any publicly-observable site by construction — not by testing more sites, but by removing the places it can behave differently per site. Tonight proved the move: one site’s walk went from 512 failures to 2. This lays that same brick everywhere the code still infers — and installs a ratchet so the assumptions can’t creep back.
Audited tonight from the real code — the gap is smaller than the error-count made it feel.
A sequence of subtractions. Each measured by the assumption-count falling and held-out sites going quieter.
A siteId === "digitalempathyvet" report-only flag in two files; an Avada reference in il-certify; a short tail of residual classifiers. Per-item disposition: remove, make data-driven, or legitimate-keep (some branches are about capture vintage, not site type; those stay, cited as such).
Designed for, not around.
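One way the inventory and the ratchet could be mechanized is sketched below. It is an illustration under stated assumptions, not the project's actual tooling: the function names, the two regex patterns, and the "assumption-keep:" marker for legitimate-keep branches are all hypothetical, chosen only to mirror the items listed above.

```typescript
// Hypothetical sketch: every name, pattern, and marker here is an assumption.

type Finding = { file: string; line: number; text: string };

// Patterns that suggest a per-site branch (illustrative, not exhaustive).
const SITE_SPECIFIC_PATTERNS: RegExp[] = [
  /siteId\s*===?\s*["'][^"']+["']/, // hard-coded site identity check
  /\bAvada\b/i,                     // named third-party reference
];

// A line carrying this marker is a documented legitimate-keep
// (for example, a capture-vintage branch) and is not counted.
const KEEP_MARKER = "assumption-keep:";

// Scan in-memory sources (file name -> content) and list suspect lines.
function findAssumptions(sources: Record<string, string>): Finding[] {
  const findings: Finding[] = [];
  for (const file of Object.keys(sources)) {
    const lines = sources[file].split("\n");
    for (let i = 0; i < lines.length; i++) {
      const text = lines[i];
      if (text.indexOf(KEEP_MARKER) !== -1) continue;
      if (SITE_SPECIFIC_PATTERNS.some((p) => p.test(text))) {
        findings.push({ file, line: i + 1, text: text.trim() });
      }
    }
  }
  return findings;
}

// The ratchet: the count may fall or hold, but never rise above the baseline.
// When it falls, the lower value becomes the new baseline.
function checkRatchet(
  current: number,
  baseline: number
): { ok: boolean; newBaseline: number } {
  return { ok: current <= baseline, newBaseline: Math.min(current, baseline) };
}
```

In a CI step, the count from findAssumptions would be compared against a stored baseline and the build failed whenever checkRatchet reports ok as false. A regex scan is crude, so it works as a tripwire for reintroduced branches rather than as proof that none exist.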
One number: the held-out blind-spot rate — new rungs per never-seen site — trending to zero across a diverse cohort, with the assumption-count monotonically falling and never rising. That number is the transition. Everything above is a means to it. When a fresh site goes through clean and stops surprising us, the compiler is general — demonstrated, not asserted.
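As a minimal sketch of how that number might be computed: the code below assumes "new rungs" is recorded as an integer count per held-out site, and that the assumption-count is sampled once per change. The type and function names are hypothetical.

```typescript
// Hypothetical sketch: the record shape and names are assumptions.

type HeldOutRun = { siteId: string; newRungs: number };

// Blind-spot rate: mean number of new rungs per never-seen site.
// Returns NaN for an empty cohort, since no rate can be claimed from no sites.
function blindSpotRate(runs: HeldOutRun[]): number {
  if (runs.length === 0) return NaN;
  let total = 0;
  for (const run of runs) total += run.newRungs;
  return total / runs.length;
}

// True when the series never rises: each sample is <= the one before it.
function neverRises(counts: number[]): boolean {
  for (let i = 1; i < counts.length; i++) {
    if (counts[i] > counts[i - 1]) return false;
  }
  return true;
}
```

Reading the two together matters. A falling rate on a small or homogeneous cohort says little, so the cohort's diversity has to be tracked alongside the number itself.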