Field Notes · The Precheck Cut
07 / 08

The Runbook Is the Product

When you're running three to five parallel builder pipelines, you can't afford to remember how to start each project. The runbook is the developer experience. Here is how Precheck's command surface was designed around operator cognitive load.

Eight commands. That is the entire operator surface for Precheck. Eight words, each one the name of a specific thing an operator might need to do, and each one designed so that running it in the wrong order produces a helpful error instead of a silent failure.

The command surface is not an afterthought. It is what the project feels like from the outside.

This post is the odd one in the series. It is not about constraints, or learning systems, or trust models, or stress-testing. It is about shell commands. Specifically, it is about the eight commands in scripts/dev.ps1 that constitute Precheck's entire developer experience, and why those commands are the right answer to a problem the parent article only hinted at.

Phase 5 in the Mech Suit Methodology describes running three to five builder pipelines in parallel. When you are running three to five pipelines, every project you touch has to be cheap to start, cheap to check, and cheap to stop. If you have to remember how to run each one, you cannot run five. You will run one, carefully, while the other four sit cold. The runbook is what lets the parallel discipline exist at all.

The eight commands

Here is the command surface, in full, lifted straight from the top of scripts/dev.ps1.

Receipt scripts/dev.ps1
param(
    [ValidateSet("up", "import", "smoke", "status", "logs", "down", "nuke", "doctor", "endpoints")]
    [string]$Command = "status"
)

Nine commands if you count the default status, or eight if you consider status the fallback rather than a separate verb. Either way — a small number. Each one is a single English word. Each one describes an operator intent in a way that does not require thinking about the underlying mechanism.

That list is not innovative. Every dev tool in existence has some version of these commands. What is specific to Precheck is the discipline about what is not in the list. There is no build command — building happens implicitly inside up, because an operator running up wants a working stack, not a successful build. There is no test command — testing happens inside smoke, because an operator running smoke wants to know if the stack actually works, which is a different question from "do the unit tests pass." There is no restart command — a restart is down followed by up, which is two commands that the operator already knows, and having a third command for the combination would just be one more thing to remember.

The discipline is: each command is one thing the operator might want to do, and each command should not have to be paired with another command that is "obviously" required. If up needs a working Docker daemon, up should check for that and fail with a helpful message. It should not require the operator to run doctor first. doctor exists for when the operator wants to proactively check the environment, not as a prerequisite for up.
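
That preflight discipline is easy to sketch. The snippet below is a Python illustration of the pattern, not Precheck's actual PowerShell implementation (dev.ps1's internals are not shown in this post), and the function names are mine: the startup verb runs its own environment check and fails with a message an operator can act on.

```python
import shutil
import subprocess
import sys

def preflight_docker():
    """Return (ok, message): is the Docker CLI present and the daemon up?"""
    if shutil.which("docker") is None:
        return False, "docker CLI not found on PATH; install Docker Desktop first"
    probe = subprocess.run(["docker", "info"], capture_output=True, text=True)
    if probe.returncode != 0:
        return False, "docker daemon is not responding; is Docker Desktop running?"
    return True, "docker ok"

def up(preflight=preflight_docker):
    """Start the stack, but run the environment check first so a missing
    daemon fails with a helpful message instead of a compose stack trace."""
    ok, message = preflight()
    if not ok:
        sys.exit(f"up: {message}")
    print("up: starting stack")
```

The point of the injectable `preflight` parameter is the design, not the mechanics: the check belongs inside the verb that needs it, so the operator never has to remember a prerequisite.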

The order the verbs teach

There is a deliberate progression built into the verbs. Not enforced by code — the operator can run them in any order. But the names are chosen so that the natural reading order is the right order for a first-time user.

The operator runbook demonstrates this directly. Here is the "standard daily flow" copy-paste sequence from the runbook.

Receipt docs/operator-runbook.md
## Standard Daily Flow
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 doctor
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 up -ProjectName precheck-verify -ApiPort 18100
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 import -ProjectName precheck-verify
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 smoke -ProjectName precheck-verify
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 status -ProjectName precheck-verify

Read the verbs in order: doctor, up, import, smoke, status. That sequence is a story. Check the environment. Start the stack. Load the seeds. Confirm it's working. Show me where things are. Each step is a checkpoint. If any one of them fails, you know exactly where the failure is, because each step only does one thing.
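
The checkpoint property generalizes to any flow of steps. A hypothetical Python sketch of the pattern (not code from dev.ps1): run named steps in order, stop at the first failure, and name exactly which step broke.

```python
import subprocess
import sys

def run_flow(steps):
    """Run (name, command) steps in order; stop at the first failure.

    Each step is a checkpoint: if one fails, the operator knows exactly
    where the flow broke, because nothing after it was attempted.
    """
    for name, cmd in steps:
        print(f"==> {name}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            sys.exit(f"flow stopped at {name!r} (exit code {result.returncode})")
    print("flow complete")
```

Because each step does one thing, the failing step's name is the diagnosis; there is no need to reconstruct which of four shells was in which state.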

The "fast restart" flow is shorter:

Receipt docs/operator-runbook.md
## Fast Restart (No Data Reset)
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 down -ProjectName precheck-verify
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 up -ProjectName precheck-verify -ApiPort 18100
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 smoke -ProjectName precheck-verify

Down, up, smoke. Three steps. That is what "fast restart" should feel like — a sequence short enough to type from memory after the third time, with a verification step at the end so you never walk away thinking the stack is up when it isn't.
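
The post does not show what smoke actually checks. A minimal version of the idea is an HTTP probe against the running API; this Python sketch is illustrative rather than Precheck's implementation, and the /health path is an assumption.

```python
import urllib.error
import urllib.request

def smoke(base_url, timeout=5.0):
    """Probe the API and return a process-style exit code.

    A smoke check answers "does the running stack actually respond?",
    which is a different question from "do the unit tests pass."
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            ok = resp.status == 200
    except (urllib.error.URLError, OSError):
        ok = False
    print("smoke: PASS" if ok else "smoke: FAIL")
    return 0 if ok else 1
```

The nonzero exit code on failure is what makes the verification step composable: a script or operator can chain it after `up` and trust the result without reading logs.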

And the destructive flow uses a deliberately ugly verb, nuke, so there is no ambiguity about whether it is safe to run:

Receipt docs/operator-runbook.md
## Full Reset (Destructive)
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 nuke -ProjectName precheck-verify
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 up -ProjectName precheck-verify -ApiPort 18100
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 import -ProjectName precheck-verify
powershell -ExecutionPolicy Bypass -File .\scripts\dev.ps1 smoke -ProjectName precheck-verify

The word nuke is ugly on purpose. down stops the stack. nuke stops the stack and destroys the database volumes. The distinction matters, and the naming is designed so that an operator who types nuke cannot claim they did not know what it would do. If the verb had been clean or reset, it would have sounded safe, and someone — probably me — would eventually have destroyed data by accident. nuke sounds exactly as dangerous as it is.
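
Assuming down and nuke map onto docker compose teardown (the post describes the behavior, not the implementation), the entire difference between the safe verb and the destructive one can be a single flag. A hypothetical Python sketch:

```python
def compose_teardown(project, destroy_volumes=False):
    """Build the docker compose teardown command line.

    Without -v, containers stop but named volumes (the database) survive:
    that is `down`. With -v, the volumes are deleted too: that is `nuke`,
    and it is irreversible.
    """
    cmd = ["docker", "compose", "-p", project, "down"]
    if destroy_volumes:
        cmd.append("-v")
    return cmd
```

One flag of difference in the mechanism is exactly why the verbs have to differ loudly at the interface: the commands look nearly identical, so the names carry the warning.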

Insight

The verb is the first line of defense against operator error. If an operator is going to be cavalier with a command, the name of the command should make cavalier use feel wrong. Name destructive things destructively. Name safe things safely. The name is part of the interface.

What the runbook replaces

I want to tell you what this replaces, because the shape of the replacement is the whole lesson.

Before the runbook existed, starting Precheck looked like this: open a shell, cd into the repo, run docker compose up, wait for the containers, open a second shell, cd into the UI folder, run npm run dev, wait for Vite, open a third shell, run a Python import script against the just-started API, then open a fourth shell and tail the docker logs manually. Every shell needed a specific environment variable set. Every shell needed to be in a specific directory. If any one of them errored, the operator had to diagnose which shell was in which state and which dependency had failed.

Every one of those steps was something I had to remember. Not because they were complicated individually, but because I was running five projects at the time, and each one had its own specific incantation. Precheck's incantation was not the same as my other project's incantation, which was not the same as the third project's incantation. Switching between projects meant a context-switch cost of about three minutes just on the startup sequence, and that cost compounded — every time I had to restart a stack, the cost came back.

Three minutes per restart does not sound like much. Multiply it by three to five parallel pipelines, plus the fact that I restart stacks multiple times a day during development, and the cost becomes real. An hour a day vanishing into "remembering how to start things" is not a reasonable tax on parallel work. The runbook cuts that to seconds. The cost is that I have to write and maintain the runbook, and write and maintain the script it drives. The benefit is that the startup sequence is now cheap enough that I actually do run all five projects instead of letting four of them go cold.

This is what "the runbook is the DevEx" means in practice. It is not that the runbook is a nice addition to the DevEx. It is that the runbook is the operator's entire mental model of the project. If the runbook is clean, the project feels clean. If the runbook is messy or missing, the project feels like friction, and friction accumulates until the project stops being touched.

The property I want you to notice

Look at the standard daily flow again. Every line is copy-pastable. There is no "edit this file first." There is no "set this environment variable." There is no "make sure Docker is running" (the doctor command checks). There is no "make sure the ports are free" (the script picks free ports automatically if you don't specify them). There is no "wait until you see this message in the logs." Every line is a terminal command that can be run unmodified, and every line prints something useful when it finishes.
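
The post does not show how dev.ps1 finds free ports. A standard technique, sketched here in Python as an illustration, is to try the preferred port and fall back to binding port 0, which asks the OS for any free one.

```python
import socket

def pick_port(preferred=None):
    """Return a usable TCP port: the preferred one if it is free,
    otherwise one the OS hands out (binding port 0 lets the kernel choose)."""
    if preferred is not None:
        try:
            with socket.socket() as s:
                s.bind(("127.0.0.1", preferred))
            return preferred  # bind succeeded, so the port was free
        except OSError:
            pass  # preferred port is taken; fall through to an OS-chosen one
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]
```

This is what converts "make sure the ports are free" from an instruction the operator must remember into a property the script guarantees.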

The property this buys is that the runbook is repeatable without thinking. An operator who has never seen Precheck before can follow the runbook from top to bottom and end up with a working stack. They do not have to understand Docker. They do not have to understand the .NET project layout. They do not have to know what Postgres is. They just have to type what the runbook says. And the runbook is short enough to fit on a single screen.

This is the same discipline as the ADR rule from post two, applied to operator verbs instead of architectural decisions. An ADR is a self-contained explanation of a decision that the next session can read and act on. A runbook command is a self-contained action that the next session can execute. The unit of reuse is a small, named, clean thing that does not require the operator to hold surrounding context in their head.

And in both cases, the discipline exists because the alternative is that the operator — almost always a future version of me — is going to have to reconstruct the context from memory, under time pressure, while also trying to do the thing the context was for. Reconstruction is expensive. The runbook is cheap. You pay the cost once, upfront, while the context is hot. You get the benefit every time you come back to the project cold.

Unlock

Parallel builder pipelines do not work if the startup cost per project is high. The Phase 5 capability from the parent article — three to five concurrent builds — depends on every project having a cheap-to-run, easy-to-remember command surface. The runbook is not a convenience. It is the prerequisite for parallelism.

Where I got this wrong first

The runbook did not exist when the project started. Precheck went through a phase where I had build scripts named things like fresh.ps1 and build.ps1 and run_intake.py and a dozen other ad-hoc commands that each did part of the job. The migration retrospective quoted in post five has a line about this:

Receipt docs/migration-retrospective-2026-03-28.md
We burned time on too many entrypoints: compose scripts, ad hoc PowerShell, Vite dev server, Task, and wrapper scripts. The simplification that mattered was not "remove every script." It was "make one command the obvious way to run the fresh slice."

"Make one command the obvious way" is the pattern. Not "have one script." The repo has more than one script — dev.ps1 handles the compose-managed flow, and there is a separate fresh.ps1 for the ephemeral demo slice. But each script has exactly one purpose, and each purpose has exactly one obvious command. The operator does not have to choose between three ways to start the stack. They have to pick which flow they are in (compose-managed production-like, or fresh local slice), and then the command for that flow is obvious.

The error I was making before was confusing "many tools" with "flexibility." I thought a project with more entrypoints was a project with more capabilities. It was actually a project with more cognitive load, because every entrypoint was a decision the operator had to make, and the operator (me) did not want to make decisions; the operator wanted to start working.

Where this travels

This post is the most portable one in the series. The discipline it describes is not specific to AI development, or to Precheck, or even to the Mech Suit Methodology. It is a general rule about developer experience that applies to any multi-project environment. If you maintain more than two projects, and you are not putting the operator surface for each project behind a small named-verb runbook, you are paying the startup cost every time you context-switch, and the cost is compounding in ways you probably have not measured.

Three concrete things you can steal from this post if none of the rest of the series applies to your work:

  1. Name commands for operator intent, not mechanism. up is better than docker-compose-up-with-build-flag. smoke is better than run-integration-tests-against-live-stack. The mechanism is an implementation detail. The intent is the interface.
  2. Name destructive things destructively. nuke is a deliberate choice. clean sounds safe and is not. If your command destroys data, the name should reflect that.
  3. Make the standard flow a copy-paste sequence. If the operator has to edit anything between commands — a config file, an environment variable, a path — the flow is broken. Fix it so they do not have to.
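
Those three rules compress into a very small dispatch pattern. A generic Python sketch (the names and behavior here are illustrative, not Precheck's): register each verb under its own name, so the verb list is the interface and an unknown verb fails by showing the valid ones.

```python
COMMANDS = {}

def command(fn):
    """Register a function under its own name: the verb is the interface."""
    COMMANDS[fn.__name__] = fn
    return fn

@command
def up(project):
    return f"up: starting stack for {project}"

@command
def smoke(project):
    return f"smoke: probing {project}"

@command
def nuke(project):
    # destructive action gets a destructive name -- no 'clean', no 'reset'
    return f"nuke: DESTROYING volumes for {project}"

def run(verb, project):
    """Dispatch on the verb; an unknown verb fails with the valid list."""
    if verb not in COMMANDS:
        raise SystemExit(f"unknown command {verb!r}; expected one of {sorted(COMMANDS)}")
    return COMMANDS[verb](project)
```

The registry doubles as documentation: printing `sorted(COMMANDS)` is the whole operator surface, which is the property that keeps the runbook small enough to fit on one screen.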

Back to the arc

The parent article's Phase 5 is the parallel-pipelines phase. It describes being able to run three to five orchestration loops at once and the way that capability expands what a single practitioner can responsibly build. This post is the specific receipt for why that capability works in practice — because each project in the pipeline has a runbook small enough to hold in working memory, and therefore the operator can switch between pipelines without paying a context-reload tax.

The next post is the series closer. It brings the determinism thread from post six back in a specific form: the four-state lesson lifecycle that makes the learning system auditable. It is the sharpest demonstration in the repo that you are still the pilot — because if you are going to trust a system that learns, you need to see what it learned, and the four states are how Precheck shows you.