Field Notes · The Precheck Cut
01 / 08

The Repo That Tells You About Itself

An agent can reconstruct the reasoning behind every decision in Precheck from the files alone. This is what it reads, and why that is the whole point of Phase 6.

The Mech Suit Methodology article ended on a promise. When every step of your workflow produces a deterministic, stateful artifact, you can point an agent at the repository and ask it to tell you about the project. The repo can tell you about itself.

I want to show you what that actually looks like. Not the claim. The receipts.

Precheck is a constraint-aware planning engine. It decomposes software work into typed plans, evaluates those plans against guardrails, learns from failure patterns, and records every decision in a format you can replay six weeks later. It is also the first project where I followed the Phase 6 discipline from the start instead of evolving into it. The result is a repository that is more documented than most companies' internal wikis — not because I wrote the docs, but because the workflow made the docs fall out the side of the build.

If you read the parent article and thought "okay, but what does a project that actually honors those rules look like on disk," this post is the answer. Over the next eight entries in this series, every post will zoom in on a single file and show you a specific Mech Suit principle doing its job. This first post is the wide shot: the whole shelf. The system that lets every other post exist.

What the repo contains

Precheck's documentation surface breaks into six layered stores, each with a different purpose and a different owner in the workflow.

The system foundation, living at docs/system-foundation-v2/, is the blueprint. Seventeen sequential documents, each numbered, each meant to be read in order. They define repo boundaries, canonical contracts, provider abstraction, the runtime loop, observability rules, testing strategy, documentation strategy, the delivery plan, invariants, and session handoffs. They exist because at one point I realized the first prototype had drifted into a tangle, and the only honest move was to write out the rebuild from first principles before any code was touched. The first file in that folder opens with a definition of success so exact that the rest of the repo can be audited against it.

Receipt: docs/system-foundation-v2/00-README-FIRST.md

The rebuild is on track when one archived run can answer all of the following without recomputation:

- What was evaluated?
- Which guardrail affected the outcome?
- Why did the run continue, retry, decompose, skip, or stop?
- What suggestion was generated?
- What changed between attempts?
- Did the path earn its depth, or was it stopped for churn?

If the answer requires reading raw provider output or reconstructing intent manually, the foundation is incomplete.

Read that list again and notice what it is. It is not a product spec. It is not a feature list. It is a test for whether the system is allowed to exist. Every ADR, every plan, every code review in the repo can be checked against those six questions. If a proposed change makes any of them harder to answer, the change is wrong. I have killed entire features against this list.
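To make the "test for whether the system is allowed to exist" concrete, here is the six-question audit expressed as a mechanical check on an archived run record. This is a sketch, not Precheck's actual archive schema: every field name below is invented for illustration.

```python
# Hypothetical sketch: the six audit questions as a mechanical check on an
# archived run record. Field names are invented; they are not Precheck's
# actual archive schema.
REQUIRED_ANSWERS = {
    "evaluated",           # What was evaluated?
    "deciding_guardrail",  # Which guardrail affected the outcome?
    "outcome_reason",      # Why continue, retry, decompose, skip, or stop?
    "suggestion",          # What suggestion was generated?
    "attempt_diff",        # What changed between attempts?
    "depth_verdict",       # Earned depth, or stopped for churn?
}

def run_is_self_describing(run: dict) -> bool:
    """True only if the archived run answers all six questions directly,
    with no recomputation or raw-output spelunking required."""
    return all(run.get(field) is not None for field in REQUIRED_ANSWERS)
```

The point of writing it this way is that the audit becomes a predicate rather than a vibe: a proposed change either keeps every archived run passing this check or it does not.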

Next layer: architecture decision records. The docs/system-foundation-v2/adr/ folder contains eleven ADRs at the time of writing. Each one captures a single structural choice with context, decision, alternatives considered, consequences, and follow-up work. ADR-001 is the decision to keep one planner and one runtime loop instead of forking into cooperating agents. ADR-002 is the decision to treat the canonical archive as the only source of truth. ADR-010 is the decision to decompose a 2,316-line monolithic tabs component into route-based pages, which — I will be honest — exists because an earlier version of the UI got so tangled that I had to revert parts of it. The ADR is the receipt of my mistake and the corrective.

ADRs are not the same as plans. They are the layer between "what we decided" and "why the code is shaped this way." A new contributor, or a future version of myself, or an agent running six weeks from now can read the ADR folder in under thirty minutes and know why every major fork in the road went the way it did. No hallway conversations required. No tribal knowledge. No "ask Derek."

Next layer: implementation plans, in plan/. There are twenty-one files. Each one follows the same shape — problem statement, root causes, proposed fixes, implementation order, stop conditions. The parent article calls these "sonnets." They are the artifact of the expensive architect session, handed off to cheap builder sessions for execution. A good sonnet contains enough of my thinking that the builder does not need to improvise. If the builder has to improvise, the sonnet was not rigorous enough, and that is a refinement problem, not an execution problem.

Next layer: agent-written retrospectives. After each significant build session — three to six hours of committed work — the agent produces a retrospective while the context is still hot. Not me, writing the next morning after the context is cold. The agent that did the work, immediately after it was done, scanning its own diffs and writing down what went well, what didn't, what it would do differently. There are three of these in the repo right now. The learning subsystem retrospective alone runs nearly a hundred lines and includes a numbered "what didn't go well" section that names the author's own mistakes.

Next layer: the product surface — personas, design system, runbooks, architecture diagrams. plan/phase1-product-definition.md names five personas (staff engineer, engineering manager, technical director, product owner, QA lead) and maps each to the specific surfaces of the product that serve them. docs/system-foundation-v2/ui-reference/operator_notes/DESIGN.md is a design system that reads like a manifesto ("no 1px borders, tonal layering instead of shadows, monospace for numerics and IDs, ghost borders at fifteen percent opacity for accessibility"). docs/operator-runbook.md is the copy-paste sequence that gets the whole stack running in under a minute.

And finally, the architecture diagrams. Mermaid source in docs/architecture-diagrams.md: a dataflow diagram that traces a request from the UI through the API into the engine and back through SSE; a state machine for the ten-state planner; a learning subsystem diagram that shows how outcomes become lessons become prompt augmentation. All in plain text, all in the repo, all rendered by GitHub on demand.
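For readers who have not kept diagrams as text before, here is a minimal sketch of what such Mermaid source might look like, reconstructed from the dataflow described above rather than copied from the actual file. The node names are illustrative.

```mermaid
flowchart LR
    UI[Operator UI] -->|request| API[HTTP API]
    API --> Engine[Planning engine]
    Engine -->|events| SSE[SSE stream]
    SSE -->|updates| UI
```

Because the source is plain text, the diagram diffs in code review and decays visibly when the architecture changes, instead of rotting silently in a wiki export.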

What it adds up to

Put those layers next to each other and an important property emerges. Every layer references the others by name, and none of them depend on human memory. If you want to know why the canonical archive is the source of truth, ADR-002 tells you. If you want to know what "source of truth" means in practice, the system foundation spells it out. If you want to know what that implied for the Python-to-.NET migration, the migration retrospective walks you through the trust model shift. If you want to know what broke and why, the agent-written retrospective from that session has a numbered list. If you want to see the runtime state machine, the architecture diagrams render it. Every claim is backed by a file, and every file cites the next file over.

Insight

The leverage is not in any single document. It is in the fact that no single document has to carry the whole system. Each artifact is narrow, cited, and located. A reader following their curiosity ends up at the right level of detail automatically.

Running the exercise live

I tested this. Six weeks after a major refactor, I spun up a fresh agent session with no context other than the repo itself and asked it a single question: "Tell me about this project." No prompt engineering. No handoff document. Just "tell me about this project."

The agent spent about four minutes reading. It started with the root README, moved to the system foundation's 00-README-FIRST.md, walked the ADR folder in numerical order, skimmed the retrospectives, opened two of the plan files to check specifics, and then produced a summary that was more accurate than anything I would have written myself from memory. It named the core principle (constraint-aware planning with derived learning). It identified the four-state lesson lifecycle from ADR-009 and correctly flagged it as the novel part. It noticed that ADR-010 was a corrective and mentioned the monolithic tabs component it replaced. It caught that the learning subsystem retrospective contained stress-test evidence for a feature that was killed before build. It even identified the parent-article-worthy moment in the migration retrospective — the line about not porting code, but moving trust models.

Six weeks out of context. Correct. Specific. Cited.

That is not the agent being smart. The agent is doing what an agent does — reading files and synthesizing. The thing that made the exercise work is that the files contain enough signal for the synthesis to land. Every time I wrote an ADR instead of holding a decision in my head, every time I asked the builder to produce a retrospective while the context was hot, every time I wrote a plan file instead of just telling the next session "do this" — I was depositing signal into the repo for a future reader that would not have access to my working memory. That future reader turned out to be an agent, but it could just as easily be a colleague, a new hire, or me six months from now.

Where this breaks

I want to be honest about where the discipline frays. Three places.

The first is drift between layers. When a plan gets executed and the implementation deviates from the plan in ways that matter, the plan file can become stale. I have caught myself reading a plan that described how a module was supposed to work, and then reading the actual module and finding that it worked differently, and then realizing nobody updated the plan. The fix for this is the Phase 6 self-retrospective — the agent, at the end of each build session, scans the docs and flags anything that no longer matches the code. It's not perfect. It catches most of the drift, not all of it.
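A crude version of that drift scan can be mechanized. The sketch below is hypothetical rather than the agent's actual procedure: it flags backtick-quoted identifiers in a doc that no longer appear anywhere in the code, which is one cheap signal that a plan has gone stale.

```python
import re

def flag_stale_references(doc_text: str, code_symbols: set[str]) -> list[str]:
    """Return identifiers the doc names in backticks that no longer exist
    in the code. A nonempty result means the doc has probably drifted.
    Illustrative only; real drift detection needs more than name matching."""
    referenced = set(re.findall(r"`([A-Za-z_]\w*)`", doc_text))
    return sorted(referenced - code_symbols)
```

A check like this catches renames and deletions; it cannot catch the harder case where the name survives but the behavior behind it changed, which is why the retrospective pass still matters.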

The second is over-documentation. Twenty-one plan files is a lot. Some of them have outlived their usefulness — the specific problem they were written to solve is long since fixed, and the plan is now an archaeological artifact rather than an active reference. I haven't figured out the right pruning discipline yet. My current heuristic is: if I haven't linked to a plan file from another artifact in three months, I move it into a plan/archive/ folder where the agent still sees it but doesn't have to treat it as current. This is a work in progress.
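The pruning heuristic is simple enough to sketch. Everything here is illustrative: the link pattern, the ninety-day window, and the function name are assumptions, not Precheck's actual tooling, and "not linked in three months" is approximated as "no inbound links and untouched for ninety days."

```python
import re
from datetime import datetime, timedelta

# Matches links like "plan/phase1-product-definition.md" in any doc.
PLAN_LINK = re.compile(r"plan/([\w.-]+\.md)")

def archive_candidates(docs: dict[str, str],
                       mtimes: dict[str, datetime],
                       now: datetime,
                       window: timedelta = timedelta(days=90)) -> list[str]:
    """docs maps every repo doc path to its text; mtimes maps plan file
    paths to their last-modified time. A plan file is a candidate for
    plan/archive/ when no *other* file links to it and it has sat
    untouched longer than the window."""
    linked = set()
    for path, text in docs.items():
        for name in PLAN_LINK.findall(text):
            if f"plan/{name}" != path:  # a file citing itself doesn't count
                linked.add(name)
    return sorted(path for path, mtime in mtimes.items()
                  if path.split("/")[-1] not in linked
                  and now - mtime > window)
```

Moving the result into plan/archive/ rather than deleting it preserves the heuristic's escape hatch: the agent can still read archived plans, it just stops treating them as current.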

The third is ADRs that should have been written but weren't. An ADR captures a structural decision. The trouble is that structural decisions sometimes happen by accident — someone (often me) picks an implementation detail that turns out to lock in a major architectural property, and nobody notices it's a decision until months later. When I catch one of these retrospectively, I write an ADR for it anyway, with a retroactive "Accepted" date and a note explaining that the decision was inferred from code before it was written down. The retroactive ADRs are better than no ADRs, but they are a worse category of signal than the ones written contemporaneously.

Limitation Discovered

Self-describing artifacts decay the moment the code changes underneath them. The Phase 6 self-retrospective is the forcing function that keeps the decay bounded, but it's bounded, not eliminated. A repo that tells you about itself still needs occasional gardening.

Back to the arc

This series is going to go deep on specific files over the next seven posts. Each one will zoom in on a single artifact and show you one Mech Suit principle doing its work. Post 2 is the ADR discipline. Post 3 is the stress-test-before-build habit. Post 4 is the root-cause diagnosis that names my own mistakes. Post 5 is the migration retrospective. Post 6 is the philosophical move that makes the whole thing replayable. Post 7 is the runbook. Post 8 is the learning lifecycle.

They all exist because the opener — the thing you just read — is true. The repo tells you about itself. Every post from here is a specific receipt for that claim.

If you want the article these are expanding on, it's here. Everything below is just evidence.