
Taming the Unpredictable

The more instructions you give an AI agent, the worse it gets. Prompt-based agents are hitting a ceiling. Scripts are the way out.

The Gold Rush

We’re in the early days, and nobody really knows where this is going. Like gold rush pioneers, we’re running on instinct: testing, failing, starting over. The tools change every month; January’s certainties become March’s punchlines.

What we’re all looking for as developers, deep down, is predictability. Knowing that if you do X, Y happens. That’s why we build tests, types, linters, pipelines. And AI is the antithesis of all that: probabilistic by nature, brilliant most of the time, but off the mark just often enough that you can’t look the other way.

The problem is that we’re at the end of the chain. We control neither the system prompt, nor the model’s training, nor the harness’s architectural decisions. We consume a model frozen after training, unable to learn from its mistakes across sessions, and the only thing we actually have control over is what happens around it.

The Age of Instructions

AI tool makers (Anthropic with Claude Code, Cursor, OpenAI with Codex) have all converged on the same idea: give developers ways to steer the agent’s behavior. Context files (AGENTS.md, CLAUDE.md), rules (.cursorrules), skills, custom commands. The more you document the rules and project context, the better the AI is supposed to behave.

Except all of this relies on instructions, and the budget is tighter than you’d think. As I write this, according to HumanLayer, a state-of-the-art LLM can follow roughly 150 to 200 instructions with reasonable consistency, and Claude Code’s system prompt already consumes about 50 of them. These instructions burn budget constantly, even when they’re irrelevant: “use pnpm” takes up context even when the agent is editing CSS. In practice, instructions get lost: the AI has nascent memory between sessions but no awareness of its own gaps, and it’s not uncommon to see it produce code that compiles but misses conventions you’ve explicitly spelled out.

What Research Tells Us

The intuition “more instructions = better results” is appealing, but research suggests the opposite.

SkillsBench (February 2026) shows that model-generated skills degrade results, that hand-written skills help but only up to about three of them, and that exhaustive documentation performs worse than no documentation at all.

Same story for context files. ETH Zurich shows that LLM-generated AGENTS.md files reduce the success rate by about 3% while increasing costs by 20%. Manually written files fare slightly better (+4%), but agents follow their instructions too literally. And when you remove existing repo documentation, context files suddenly become useful, a sign that they were mostly duplicating information already discoverable in the code.

Vercel achieves a 100% pass rate with an ultra-condensed AGENTS.md (8 KB), but their own conclusion is telling: “small wording tweaks produce large behavioral swings.” Fragile.

The String Match Trap

The Ralph Wiggum method, popularized by Geoffrey Huntley, is a solid pattern for gaining predictability. The idea: run an agent in a loop, each iteration starting with a fresh context, with progress persisted in files and git history.

while :; do cat PROMPT.md | claude-code ; done

Where it breaks down is completion detection. The default implementation asks the agent to emit a token (COMPLETE) and checks for its presence in the output with a simple string match.

In three months of use, I’ve had three loops fail silently, errors detected as successes. The LLM, faced with a failure, writes something like “the task failed, so I should not write COMPLETE.” The token is in the output. The string match passes. The loop stops. Everything is green. Nothing works.

The LLM didn’t disobey. It reasoned, and its reasoning contained exactly the token it was trying not to emit, much like Wegner’s ironic process theory: “don’t think of a white bear” guarantees you’ll think of one. If detection relied on a deterministic script (checking that tests pass, that the build compiles), this bug wouldn’t exist.
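As a minimal sketch of that idea: the loop below only stops when a command’s exit code says so, never because a token appeared in the transcript. The agent and check commands are placeholders for whatever your project uses.

```shell
# Sketch: completion decided by exit codes, not by string-matching the
# transcript. agent_cmd and check_cmd are placeholders, e.g.
# 'cat PROMPT.md | claude -p' and 'pnpm test && pnpm build'.
loop_until_verified() {
  agent_cmd=$1
  check_cmd=$2
  while :; do
    eval "$agent_cmd"            # one fresh agent iteration
    eval "$check_cmd" && break   # deterministic success check
  done
}
```

Whatever the model writes about COMPLETE in its output is irrelevant here: only `check_cmd` returning zero ends the loop.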

The Opposite Approach: My Current Flow

The approach emerging in recent days is the opposite of instinct: fewer instructions, not more. Matt Pocock reaches the same conclusion: keep the bare minimum in context files, and for anything that must be guaranteed, use deterministic mechanisms.

Hooks are powerful because they’re deterministic. This isn’t “prompt the agent to remember to run tests.” It’s “tests run because the workflow requires it.”

— Jan Van Eyck, Guardrails for Agentic Coding, 2026

On this project, every use case that was an instruction became a script:

CLAUDE.md instruction → Deterministic equivalent

  • “Format your code with oxfmt” → PostToolUse hook that reformats every modified file
  • “Use pnpm, not npm” → PreToolUse hook that rejects npm commands
  • “Never merge a PR yourself” → PreToolUse hook that blocks gh pr merge
  • “Update modifiedAt when you edit an article” → Pre-commit script that updates the date automatically
  • “Don’t add co-authoring to commits” → commit-msg hook that strips the line

Claude Code exposes lifecycle hooks, shell scripts that run automatically when the agent performs certain actions, either before (PreToolUse) or after (PostToolUse).
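For reference, each hook receives a JSON description of the tool call on stdin, which is why the scripts below lean on jq. For a Bash invocation, the payload looks roughly like this (abbreviated; the exact shape depends on your Claude Code version, and the command shown is just an example):

```json
{
  "tool_name": "Bash",
  "tool_input": { "command": "npm install left-pad" }
}
```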

“Format your code with oxfmt”

PostToolUse hook: triggers after every file modification. The matcher filters the relevant tools, and exit 0 ensures the hook never blocks the agent even if formatting fails. The AI doesn’t need to think about it.

.claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write|NotebookEdit",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/format-on-save.sh",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
.claude/hooks/format-on-save.sh
#!/bin/sh
# The hook's JSON input arrives on stdin; pull out the edited file's path
jq -r '.tool_input.file_path // .tool_input.notebook_path' \
  | xargs oxfmt --write --no-error-on-unmatched-pattern 2>/dev/null
exit 0  # never block the agent, even if formatting fails

“Use pnpm, not npm”

PreToolUse hook: rejects the command before it runs, exit code 2, explicit error message. The agent receives the message and adapts. The instruction “use pnpm” relies on the model’s goodwill. The hook that rejects npm install is a wall.

.claude/hooks/block-npm.sh
#!/bin/sh
# Inspect the Bash command before it runs; exit 2 blocks it
# and feeds the error message back to the agent
jq -r '.tool_input.command' \
  | grep -qE '^npm\b|\bnpm install\b|\bnpm i\b' \
  && echo 'Blocked: use pnpm instead of npm.' >&2 \
  && exit 2
exit 0

“Never merge a PR yourself”

Same mechanism. An accidental merge is reversible (you revert it), but it’s the kind of cleanup you’d rather avoid.

.claude/hooks/block-pr-merge.sh
#!/bin/sh
# Block any attempt to merge a PR; a human has to do it
jq -r '.tool_input.command' \
  | grep -q 'gh pr merge' \
  && echo 'Blocked: merging requires human approval.' >&2 \
  && exit 2
exit 0

“Update modifiedAt when you edit an article”

The Husky pre-commit hook is nothing new: linting, formatting, standard stuff. What’s interesting is taking it further with domain-specific scripts. The script only triggers when actual content has changed, to avoid an infinite loop with its own modification.

.husky/pre-commit
TODAY=$(date +%Y-%m-%d)
# Portable in-place sed (GNU vs BSD)
sedi() { if sed --version >/dev/null 2>&1; then sed -i "$@"; else sed -i '' "$@"; fi; }

for file in $(git diff --cached --name-only --diff-filter=M -- 'content/*.mdx'); do
  # Ignore files whose only staged change is the modifiedAt line itself
  content_changed=$(git diff --cached -U0 "$file" \
    | grep '^[+-]' | grep -v '^[+-]\{3\}' \
    | grep -v '^[+-]modifiedAt:' | head -1)
  if [ -z "$content_changed" ]; then
    continue
  fi

  if grep -q '^modifiedAt:' "$file"; then
    sedi "s/^modifiedAt:.*$/modifiedAt: \"$TODAY\"/" "$file"
  else
    sedi "/^publishedAt:/a\\
modifiedAt: \"$TODAY\"" "$file"
  fi
  git add "$file"
done

And AI here? It’s the one writing those bespoke scripts for your project.

“Don’t add co-authoring to commits”

Gotcha: the Co-Authored-By: Claude line is injected by Claude Code’s configuration, not by the model. Asking it not to do this in a CLAUDE.md will work most of the time, but not every time. The fix lives in settings.json, not in instructions.

.claude/settings.json
{
  "includeCoAuthoredBy": false
}

Filtering What the Model Sees

There’s a less obvious category: not fixing code or blocking an action, but filtering what the model sees. PostToolUse hooks can rewrite a tool’s output before it enters the context. Truncate a 20 KB build output to 10 KB, compress a 500-line file into function signatures, strip the <system-reminder> blocks that accumulate over the conversation. Every byte that doesn’t enter the context is budget freed up for reasoning. claude-warden takes this logic to its conclusion: token governance, truncation, structural compression, per-subagent budgets, all in Bash and jq.
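As a sketch of the truncation idea, here is a generic filter that keeps the head and tail of an oversized output and elides the middle. It’s plain POSIX shell; how its result gets wired back into the hook’s response is version-dependent and deliberately left out.

```shell
# Keep the first and last LIMIT bytes of stdin, eliding the middle.
# A hypothetical building block for an output-filtering hook; wiring
# it into Claude Code's hook response format is not shown here.
truncate_output() {
  limit=$1
  input=$(cat)
  size=${#input}
  if [ "$size" -le $((limit * 2)) ]; then
    printf '%s' "$input"       # small enough: pass through untouched
  else
    printf '%s\n[... %s bytes elided ...]\n%s' \
      "$(printf '%s' "$input" | head -c "$limit")" \
      "$((size - limit * 2))" \
      "$(printf '%s' "$input" | tail -c "$limit")"
  fi
}
```

The head-and-tail split matters: build logs usually put the command at the top and the error at the bottom, so those are the bytes worth keeping.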

What Stays in Instructions

Not everything can be automated. My /commit command remains a prompt (conventional commits, batching logic, scope detection) because it requires judgment, not verification. The AI needs to decide whether a change is a feat or a fix, whether two files belong in the same commit or not, and no script can do that.

The rule I follow:

  • Repeatable and verifiable → script (hook, linter, pre-commit)
  • Contextual and requires judgment → instruction (command, skill)
  • When in doubt → script

And above all: no 200-line instruction file that the model will skim and that’ll go stale within weeks. The project’s content (the code, the types, the file structure) is already discoverable. The AI knows how to read a package.json, a tsconfig.json, a directory tree. Spelling it out in a rules file just adds noise. If I have a CLAUDE.md, it’s a few lines, as few as possible, and only for repeated mistakes I haven’t managed to extract into a script. The useful exercise is to reread every line of your instruction file and ask: could a script guarantee this? If so, the line doesn’t belong there.

We have a natural bias toward documenting everything for the AI, the way we would for a new developer. But a new developer reads the docs once and remembers. The AI rereads them every session, and every context token spent on our rules is one less token available for reasoning.

When Instructions Become Gates

There’s one case where instructions retain an advantage over scripts: multi-step creative tasks where each phase requires judgment, but the transition between phases can be locked down.

I built a skill to automate React component creation from Figma, leveraging a design system and design tokens. The workflow is broken into five stages, inspired by the Ralph Wiggum method:

  1. Study: analyze the design tokens and Figma file, extract variables, spacing, typography
  2. Architecture: define the component and sub-component breakdown, props, variants
  3. Implementation: generate the code. Erratic by design, this is a first pass, not the final result
  4. Refinement loop: compare the output to the design, list discrepancies, fix, re-compare, until zero differences remain
  5. Validation: tests, accessibility, design system consistency

Each stage has sine qua non conditions for writing COMPLETE and moving to the next. Stage 1 doesn’t end until tokens are extracted. Stage 4 doesn’t end as long as visual discrepancies remain. These are gates, not suggestions.

The stage 3 implementation is deliberately loose, and that’s counterintuitive, but rather than asking the AI to be perfect on the first try (which it won’t be), I accept imperfection and correct it in a dedicated loop. Stage 4 is stage 3’s safety net.

This is a skill, a tool-agnostic instruction file. It works with Claude Code, Cursor, or any agent that supports skills. But to truly make the process reliable, the next step would be to turn each gate into an actual script. A for loop Ralph Wiggum-style, where each iteration relaunches the agent with a fresh context on a single stage, and completion is verified by code, not by the LLM.
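Sketched under assumptions about file layout (one prompt file and one check script per stage, all hypothetical names), that target loop could look like:

```shell
# Hypothetical staged loop: each stage reruns the agent with a fresh
# context until that stage's deterministic check script passes.
# prompts/<stage>.md, checks/<stage>.sh and `claude -p` are assumptions.
run_stages() {
  for stage in "$@"; do
    until "./checks/$stage.sh"; do           # gate verified by code
      cat "prompts/$stage.md" | claude -p    # fresh context per iteration
    done
  done
}
# e.g. run_stages study architecture implementation refine validate
```

Stage 4’s check script would be the visual diff against the design; stage 5’s would run tests and accessibility audits. The agent never decides it is done.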

The skill with its gates is a pragmatic compromise. The loop with deterministic checks is the target. Between the two lies the time you have and the tools you know.

Conclusion

We’re still early, the tools change every month, and what works today might be obsolete tomorrow. Even the people building these tools aren’t sure where they’re heading.

The official documentation for Claude Code recommends running /init to generate a CLAUDE.md, while research shows that LLM-generated files degrade performance. Prompting best practices went in just a few months from “use CRITICAL, NEVER, ALWAYS” to “be positive, Claude 4 is more sensitive.” All while writing on the same page that you can “tune instructions by adding emphasis (e.g., ‘IMPORTANT’ or ‘YOU MUST’) to improve adherence.”

When you’re at the end of the chain, working with probabilistic algorithms feels like an inexact science. Our leverage is the same as always: test, observe, conclude. But the principle won’t change: predictability is built with code, not with words.