The General Compiler — Transition Plan

The one principle

Generalize by subtraction, not addition. Every place the compiler branches on which site it is, or reconstructs something the browser already recorded, is a generalization risk. It generalizes to any site exactly when it has none of them left — because the browser’s observation is the same kind of object for every site.

↘ go deeper — why this is the whole game

The browser is the universal, type-blind oracle: a request for a CSS file looks identical whether the site is Avada, Elementor, or hand-written. Reduce any site to the set of (request → response) pairs a browser observed, and serve that set back, deriving nothing. This is the genome’s type-blindness (“operate on the provenance, not the type”) applied one layer down to compilation (“replay the observation, not the site”). The confidence that we generalize then comes from a structural fact, that no code path behaves differently per site, and not from having tested N sites. Passing on one site becomes real evidence for all sites of the same observation-class.
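
As a rough illustration of "replay the observation, not the site", the sketch below indexes observed (request → response) pairs and answers lookups from that index alone. The `Observation` shape and the function names are hypothetical, invented for this example and not taken from the compiler's code. The point is structural: nothing in it knows or asks which site it is serving.

```typescript
// Hypothetical shape of one browser observation (not the compiler's real type).
interface Observation {
  method: string;
  url: string;
  status: number;
  headers: Record<string, string>;
  body: string;
}

// A lookup either returns exactly what was observed or reports a miss.
// There is deliberately no third "best guess" variant.
type ReplayResult =
  | { kind: "hit"; observation: Observation }
  | { kind: "miss"; key: string };

// The key is built only from the request; no site or theme name enters it.
function keyOf(method: string, url: string): string {
  return `${method.toUpperCase()} ${url}`;
}

function buildReplayIndex(observations: Observation[]): Map<string, Observation> {
  const index = new Map<string, Observation>();
  for (const o of observations) {
    index.set(keyOf(o.method, o.url), o);
  }
  return index;
}

function replay(
  index: Map<string, Observation>,
  method: string,
  url: string
): ReplayResult {
  const key = keyOf(method, url);
  const observation = index.get(key);
  return observation ? { kind: "hit", observation } : { kind: "miss", key };
}
```

A miss is returned as data and never papered over, which is what lets a later layer cite the hole rather than infer a filler.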

Where we already are

Audited tonight from the real code — the gap is smaller than the error-count made it feel.

Site-specific behavior

~2 branches, both a DE report-only flag

The compiler essentially does not branch on which site walked in. That’s already the type-blind ideal, not an aspiration — and a test now forbids the class from growing.

Theme knowledge

Quarantined to the frontend layer by design

The three-layer split already declares Avada knowledge frontend-only; the backend serve path is theme-clean (one file left to leak-check). The wall exists — the work is auditing it.

The spine

Already chose replay over reconstruction

The IL pivot — “serve what you captured, don’t reconstruct it” — is the compiler’s founding decision. Most of it already reads rather than re-derives. Tonight dragged the one big straggler up to that standard.

The transition, phase by phase

A sequence of subtractions, each measured by the assumption-count falling and held-out sites going quieter.

Phase 0 — Land the standing proof. Finish the in-flight run to a first green baseline: the current site to a clean walk (2 seams left), then the second held-out site, then DE itself, then merge the branch. And run the site the compiler was originally tuned on through the new code — the regression check I owe you.
↘ go deeper

fairfax’s 2 residual walk failures (space-in-path encoding; slider REST endpoint); mynhvet = DoD-C; DE = DoD-D; merge PR #1847 on green; adamson regression run on the dissolved code. Why first: a green held-out run is the zero-point every later measurement needs — you can’t watch “surprises falling” without a baseline.
Phase 1 — Finish the replay (kill the last reconstruction debt). Resource serving is now replay except two seams: encoding normalization (spaces and %-encoding drifting between capture, placement, and request) and dynamic endpoints (a slider’s live API response). Fix the encoding; make dynamic content synthesize-or-cite, never guess. Then collapse the two resource-serving paths into the one replay path. After this, “replay, don’t reconstruct” is true for the whole serve surface, not just most of it.
↘ go deeper

Encoding contract: one canonical path form carried from capture through serve, so the requested URL and the on-disk path can never diverge on a space or a %-escape. Dynamic contract: an endpoint that computes its answer per request is synthesized from captured shape or explicitly cited — never filename-guessed. Pipeline unification: il-rewrite (flat replay) and produce.ts (reconstruction) converge on one path; the dissolve began this. This phase is load-bearing because it’s the last place the serve path still infers.
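
One way the encoding contract could look, as a minimal sketch: every path segment is percent-decoded once and then re-encoded, so a raw space and a `%20` land on the same canonical string. The function names and the choice of `encodeURIComponent` as the canonical form are assumptions made for this example; the plan itself does not fix the form.

```typescript
// Assumed canonical form: each path segment percent-decoded once, then
// re-encoded with encodeURIComponent. The plan does not specify this form.
function canonicalSegment(segment: string): string {
  let decoded: string;
  try {
    decoded = decodeURIComponent(segment);
  } catch {
    // Malformed escape (for example a bare "%"): treat the segment as literal text.
    decoded = segment;
  }
  return encodeURIComponent(decoded);
}

// Canonicalize a URL path segment by segment so "/" separators survive intact.
function canonicalPath(path: string): string {
  return path.split("/").map(canonicalSegment).join("/");
}
```

If capture, placement, and serve all key on `canonicalPath(...)`, the requested URL and the stored path cannot drift apart on a space or an escape. One known limitation of this particular form: a file whose literal name contains the characters `%20` would be conflated with the same name containing a space.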
Phase 2 — The ratchet (make the principle enforceable). The keystone. A standing check that inventories every site-specific branch, every reconstruction point, every cross-time comparison, and every theme reference outside the frontend layer — reports the count — and fails the moment a new one appears. Without it, this whole plan is a one-time cleanup that quietly re-drifts. With it, the assumption-count can only ever fall.
↘ go deeper

Plato-backed structural queries plus targeted grep across four categories (site-conditional branch, reconstruction/inference point, cross-moment comparison, theme-ref-outside-frontend). Emits the assumption-count; fails the merge gate on any NEW entry. Anti-vacuity floor: the gate must go RED on a deliberately-planted assumption (a fake site-name branch) before it’s trusted — a gate that can’t fail is decoration. This is what turns “generalize by subtraction” from a vibe into a durable ratchet.
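
The comparison step of such a gate might be sketched as follows. This assumes the inventory has already been produced by the structural queries and grep; the `Assumption` shape, the category strings, and the function names are hypothetical. The sketch covers only "fail on any NEW entry" and a simplified anti-vacuity check.

```typescript
// Hypothetical category labels mirroring the four categories named above.
type Category =
  | "site-conditional-branch"
  | "reconstruction-point"
  | "cross-moment-comparison"
  | "theme-ref-outside-frontend";

// Hypothetical inventory entry; the real scanner's output shape is not known here.
interface Assumption {
  category: Category;
  file: string;
  marker: string;
}

interface RatchetReport {
  count: number;       // the assumption-count for the current tree
  added: Assumption[]; // entries absent from the baseline
  pass: boolean;       // false as soon as one new entry appears
}

function idOf(a: Assumption): string {
  return `${a.category}|${a.file}|${a.marker}`;
}

function ratchet(baseline: Assumption[], current: Assumption[]): RatchetReport {
  const known = new Set(baseline.map(idOf));
  const added = current.filter((a) => !known.has(idOf(a)));
  return { count: current.length, added, pass: added.length === 0 };
}

// Anti-vacuity floor for the comparison step only: adding a planted
// fake site-name branch to the inventory must turn the gate red.
function gateIsNonVacuous(baseline: Assumption[], current: Assumption[]): boolean {
  const planted: Assumption = {
    category: "site-conditional-branch",
    file: "__planted__.ts",
    marker: 'siteId === "planted-fake-site"',
  };
  return !ratchet(baseline, [...current, planted]).pass;
}
```

A fuller floor would plant the fake branch in the scanned source tree itself, so the scanner is exercised as well as the comparison.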
Phase 3 — Burn down the inventory. Work the list the ratchet produces: the ~2 site hardcodes, the one theme-leak, any remaining inference points. Each removal drops the count — and is measured, not asserted.
↘ go deeper

Tonight’s starting inventory: a siteId === "digitalempathyvet" report-only flag in two files; an Avada reference in il-certify; a short tail of residual classifiers. Per-item disposition: remove, make data-driven, or legitimate-keep (some branches are about capture vintage, not site type; those stay, cited as such).
Phase 4 — Earn generality (a widening, diverse held-out cohort). Run a growing set of never-seen sites — critically including at least one non-Avada site, because passing only Avada sites validates the frontend layer, not generality. Watch new-rungs-per-new-site. When it trends to zero across real variety, generalization is demonstrated, not claimed.
↘ go deeper

Cohort chosen for diversity across theme, plugin set, and size. The blind-spot rate IS the ruler for the entire claim (it’s the frame-fitness of “does it generalize”). This is also where non-Avada admission gets re-evaluated — the pilot cohort is Avada-scoped today, and the general compiler’s real exam is a non-Avada site going through clean.
Phase 5 — Name the boundary (the honesty surface). Write down what “any site” actually means — publicly-observable and capture-exercisable — and what’s explicitly out: auth-walled, per-user, or real-time sites, which are a different problem (modeling a session, not freezing a page). Document how dynamic content and un-observable resources are cited, never guessed.
↘ go deeper

The scope statement; the completeness policy — observation surface must equal delivery surface (whatever the deployed site and its walk will exercise, the capture must have exercised: the walk scrolls, so the capture scrolls); and the cite-don’t-guess contract for the irreducible residue (a page that computes per request cannot be perfectly frozen — serve the same-moment version, cite the variance).
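
The completeness policy can be read as a set comparison: whatever the delivery surface will exercise but the capture did not becomes a citation, and per-request endpoints are cited even when a same-moment copy is served. The sketch below illustrates that reading only. The `Citation` shape, the reason strings, and `citeResidue` are hypothetical names, and how the real pipeline identifies dynamic endpoints is not assumed here.

```typescript
// Hypothetical reasons a resource ends up cited instead of silently served.
type CiteReason = "per-request-dynamic" | "not-exercised-by-capture";

interface Citation {
  url: string;
  reason: CiteReason;
}

// deliverySurface: URLs the deployed site and its walk will request.
// captured: URLs the capture actually observed.
// dynamic: URLs known to compute their answer per request.
function citeResidue(
  deliverySurface: string[],
  captured: Set<string>,
  dynamic: Set<string>
): Citation[] {
  const seen = new Set<string>();
  const citations: Citation[] = [];
  for (const url of deliverySurface) {
    if (seen.has(url)) continue; // cite each URL at most once
    seen.add(url);
    if (dynamic.has(url)) {
      // A same-moment copy may still be served; the variance is what gets cited.
      citations.push({ url, reason: "per-request-dynamic" });
    } else if (!captured.has(url)) {
      citations.push({ url, reason: "not-exercised-by-capture" });
    }
  }
  return citations;
}
```

An empty result means the observation surface covers the delivery surface for that walk; a non-empty one is the list to publish, not a list to guess around.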

Honest hard problems

Designed for, not around.

Over-subtraction

Ripping out legitimate capture-vintage branches in a zeal for zero. The ratchet flags; the human decides — not everything is a site-assumption.

The dynamic-content rabbit hole

Chasing perfect replay of per-request endpoints. Cite-don’t-chase, bounded. A static freeze has a floor: honest citation gets past it, and more code does not.

False generality

An all-Avada held-out set proves the frontend layer, not the compiler. Mitigation: mandate non-Avada diversity in the cohort.

A vacuous ratchet

A gate that counts noise or can’t be made to fail is a false green. It must go red on a planted assumption before it’s trusted.

Observation completeness — the real ceiling

Lazy-loaded, interaction-gated, or geo-gated resources that the capture never exercised can’t be served by any amount of type-blindness. The capture must exercise the delivery surface; whatever it structurally can’t reach gets cited, never silently served as a hole.

Your calls — the red-pen targets

1 ★

Scope of “any site.”

Ratify publicly-observable + capture-exercisable as the definition — auth / per-user / real-time explicitly out? Naming this boundary is what keeps a “general compiler” from over-promising.

2 ★

The ratchet as a norm.

Approve making “the assumption-count can’t rise” an enforced merge gate? This is the norm-change that makes the transition durable instead of a one-time cleanup — and it’s yours to ratify.

Sequence.

Finish the green proof (Phase 0) before the structural burndown, or interleave? I lean finish-first — you need the baseline to measure against.

Dynamic-content stance.

How far do we chase per-request content before calling it “cited, not served”? My instinct is cite early, chase little.

The cohort bar.

How diverse and how large before we call it general? I propose at least one non-Avada site plus an N-diverse set — but the bar for “demonstrated” is your judgment.

How we’ll know it worked

One number: the held-out blind-spot rate — new rungs per never-seen site — trending to zero across a diverse cohort, with the assumption-count monotonically falling and never rising. That number is the transition. Everything above is a means to it. When a fresh site goes through clean and stops surprising us, the compiler is general — demonstrated, not asserted.
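
For concreteness, the two measurements could be computed along these lines. This is a sketch under assumptions: the `HeldOutRun` record, the trailing-window averaging, and both function names are illustrative choices, and the plan does not say how many recent sites the trend should be read over.

```typescript
// Hypothetical record: one entry per never-seen site, in run order.
interface HeldOutRun {
  site: string;
  newRungs: number; // new rungs this site surfaced (term from the plan)
}

// Mean new rungs per site over the most recent `window` held-out runs.
function blindSpotRate(runs: HeldOutRun[], window: number): number {
  if (window <= 0) throw new RangeError("window must be positive");
  const recent = runs.slice(-window);
  if (recent.length === 0) return NaN; // no held-out runs yet, so no rate
  const total = recent.reduce((sum, r) => sum + r.newRungs, 0);
  return total / recent.length;
}

// True when no assumption-count in the series exceeds the one before it.
function neverRises(counts: number[]): boolean {
  return counts.every((c, i) => i === 0 || c <= counts[i - 1]);
}
```

A rate of zero over a window that includes at least one non-Avada site, together with `neverRises` holding for the recorded assumption-counts, is one way to state "demonstrated" as a check and not a judgment call. The window size remains a decision for the cohort bar.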