
Taming the Unpredictable

The more instructions you give an AI agent, the worse it gets. Prompt-based agents are hitting a ceiling. Scripts are the way out.

The Gold Rush

We’re in the early days, and nobody really knows where this is going. Like gold rush pioneers, we’re running on instinct: testing, failing, starting over. The tools change every month; January’s certainties become March’s punchlines.

What we’re all looking for as developers, deep down, is predictability. Knowing that if you do X, Y happens. That’s why we build tests, types, linters, pipelines. And AI is the antithesis of all that: probabilistic by nature, brilliant most of the time, but off the mark just often enough that you can’t look the other way.

The problem is that we’re at the end of the chain. We control neither the system prompt, nor the model’s training, nor the harness’s architectural decisions. We consume a model frozen after training, unable to learn from its mistakes across sessions, and the only thing we actually have control over is what happens around it.

The Age of Instructions

AI tool makers (Anthropic with Claude Code, Cursor, OpenAI with Codex) have all converged on the same idea: give developers ways to steer the agent’s behavior. Context files (AGENTS.md, CLAUDE.md), rules (.cursorrules), skills, custom commands. The more you document the rules and project context, the better the AI is supposed to behave.

Except all of this relies on instructions, and the budget is tighter than you’d think. As I write this, according to HumanLayer, a state-of-the-art LLM can follow roughly 150 to 200 instructions with reasonable consistency, and Claude Code’s system prompt already consumes about 50 of them. These instructions burn budget constantly, even when they’re irrelevant: “use pnpm” takes up context even when the agent is editing CSS. In practice, instructions get lost: the AI has nascent memory between sessions but no awareness of its own gaps, and it’s not uncommon to see it produce code that compiles but misses conventions you’ve explicitly spelled out.

What Research Tells Us

The intuition “more instructions = better results” is appealing, but research suggests the opposite.

SkillsBench (February 2026) shows that model-generated skills degrade results, that hand-written skills help but only up to about three of them, and that exhaustive documentation performs worse than no documentation at all.

Same story for context files. ETH Zurich shows that LLM-generated AGENTS.md files reduce the success rate by about 3% while increasing costs by 20%. Manually written files fare slightly better (+4%), but agents follow their instructions too literally. And when you remove existing repo documentation, context files suddenly become useful, a sign that they were mostly duplicating information already discoverable in the code.

Vercel achieves a 100% pass rate with an ultra-condensed AGENTS.md (8 KB), but their own conclusion is telling: “small wording tweaks produce large behavioral swings.” Fragile.

The String Match Trap

The Ralph Wiggum method, popularized by Geoffrey Huntley, is a solid pattern for gaining predictability. The idea: run an agent in a loop, each iteration starting with a fresh context, with progress persisted in files and git history.

while :; do cat PROMPT.md | claude-code ; done

Where it breaks down is completion detection. The default implementation asks the agent to emit a token (COMPLETE) and checks for its presence in the output with a simple string match.

In three months of use, I’ve had three loops fail silently, errors detected as successes. The LLM, faced with a failure, writes something like “the task failed, so I should not write COMPLETE.” The token is in the output. The string match passes. The loop stops. Everything is green. Nothing works.

The LLM didn’t disobey. It reasoned, and its reasoning contained exactly the token it was trying not to emit, much like Wegner’s ironic process theory: “don’t think of a white bear” guarantees you’ll think of one. If detection relied on a deterministic script (checking that tests pass, that the build compiles), this bug wouldn’t exist.
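As a minimal sketch of that idea: the loop below only stops when a command’s exit code says so, never because a token appeared in the transcript. The agent and check commands are placeholders for whatever your project uses.

```shell
# Sketch: completion decided by exit codes, not by string-matching the
# transcript. agent_cmd and check_cmd are placeholders, e.g.
# 'cat PROMPT.md | claude -p' and 'pnpm test && pnpm build'.
loop_until_verified() {
  agent_cmd=$1
  check_cmd=$2
  while :; do
    eval "$agent_cmd"            # one fresh agent iteration
    eval "$check_cmd" && break   # deterministic success check
  done
}
```

Whatever the model writes about COMPLETE in its output is irrelevant here: only `check_cmd` returning zero ends the loop.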

The Opposite Approach: My Current Flow

The approach emerging in recent days is the opposite of instinct: fewer instructions, not more. Matt Pocock reaches the same conclusion: keep the bare minimum in context files, and for anything that must be guaranteed, use deterministic mechanisms.

Hooks are powerful because they’re deterministic. This isn’t “prompt the agent to remember to run tests.” It’s “tests run because the workflow requires it.”

— Jan Van Eyck, Guardrails for Agentic Coding, 2026

On this project, every use case that was an instruction became a script:

CLAUDE.md instruction → Deterministic equivalent

  • “Format your code with oxfmt” → PostToolUse hook that reformats every modified file
  • “Use pnpm, not npm” → PreToolUse hook that rejects npm commands
  • “Never merge a PR yourself” → PreToolUse hook that blocks gh pr merge
  • “Update modifiedAt when you edit an article” → Pre-commit script that updates the date automatically
  • “Don’t add co-authoring to commits” → commit-msg hook that strips the line

Claude Code exposes lifecycle hooks, shell scripts that run automatically when the agent performs certain actions, either before (PreToolUse) or after (PostToolUse).
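For reference, each hook receives a JSON description of the tool call on stdin, which is why the scripts below lean on jq. For a Bash invocation, the payload looks roughly like this (abbreviated; the exact shape depends on your Claude Code version, and the command shown is just an example):

```json
{
  "tool_name": "Bash",
  "tool_input": { "command": "npm install left-pad" }
}
```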

“Format your code with oxfmt”

PostToolUse hook: triggers after every file modification. The matcher filters the relevant tools, and exit 0 ensures the hook never blocks the agent even if formatting fails. The AI doesn’t need to think about it.

.claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write|NotebookEdit",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/format-on-save.sh",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
.claude/hooks/format-on-save.sh
#!/bin/sh
# The hook's JSON input arrives on stdin; pull out the edited file's path
jq -r '.tool_input.file_path // .tool_input.notebook_path' \
  | xargs oxfmt --write --no-error-on-unmatched-pattern 2>/dev/null
exit 0  # never block the agent, even if formatting fails

“Use pnpm, not npm”

PreToolUse hook: rejects the command before it runs, exit code 2, explicit error message. The agent receives the message and adapts. The instruction “use pnpm” relies on the model’s goodwill. The hook that rejects npm install is a wall.

.claude/hooks/block-npm.sh
#!/bin/sh
# Inspect the Bash command before it runs; exit 2 blocks it
# and feeds the error message back to the agent
jq -r '.tool_input.command' \
  | grep -qE '^npm\b|\bnpm install\b|\bnpm i\b' \
  && echo 'Blocked: use pnpm instead of npm.' >&2 \
  && exit 2
exit 0

“Never merge a PR yourself”

Same mechanism. An accidental merge is reversible (you revert it), but it’s the kind of cleanup you’d rather avoid.

.claude/hooks/block-pr-merge.sh
#!/bin/sh
# Block any attempt to merge a PR; a human has to do it
jq -r '.tool_input.command' \
  | grep -q 'gh pr merge' \
  && echo 'Blocked: merging requires human approval.' >&2 \
  && exit 2
exit 0

“Update modifiedAt when you edit an article”

The Husky pre-commit hook is nothing new: linting, formatting, standard stuff. What’s interesting is taking it further with domain-specific scripts. The script only triggers when actual content has changed, to avoid an infinite loop with its own modification.

.husky/pre-commit
TODAY=$(date +%Y-%m-%d)
# Portable in-place sed (GNU vs BSD)
sedi() { if sed --version >/dev/null 2>&1; then sed -i "$@"; else sed -i '' "$@"; fi; }

for file in $(git diff --cached --name-only --diff-filter=M -- 'content/*.mdx'); do
  # Ignore files whose only staged change is the modifiedAt line itself
  content_changed=$(git diff --cached -U0 "$file" \
    | grep '^[+-]' | grep -v '^[+-]\{3\}' \
    | grep -v '^[+-]modifiedAt:' | head -1)
  if [ -z "$content_changed" ]; then
    continue
  fi

  if grep -q '^modifiedAt:' "$file"; then
    sedi "s/^modifiedAt:.*$/modifiedAt: \"$TODAY\"/" "$file"
  else
    sedi "/^publishedAt:/a\\
modifiedAt: \"$TODAY\"" "$file"
  fi
  git add "$file"
done

And AI here? It’s the one writing those bespoke scripts for your project.

“Don’t add co-authoring to commits”

Gotcha: the Co-Authored-By: Claude line is injected by Claude Code’s configuration, not by the model. Asking it not to do this in a CLAUDE.md will work most of the time, but not every time. The fix lives in settings.json, not in instructions.

.claude/settings.json
{
  "includeCoAuthoredBy": false
}

Filtering What the Model Sees

There’s a less obvious category: not fixing code or blocking an action, but filtering what the model sees. PostToolUse hooks can rewrite a tool’s output before it enters the context. Truncate a 20 KB build output to 10 KB, compress a 500-line file into function signatures, strip the <system-reminder> blocks that accumulate over the conversation. Every byte that doesn’t enter the context is budget freed up for reasoning. claude-warden takes this logic to its conclusion: token governance, truncation, structural compression, per-subagent budgets, all in Bash and jq.
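As a sketch of the truncation idea, here is a generic filter that keeps the head and tail of an oversized output and elides the middle. It’s plain POSIX shell; how its result gets wired back into the hook’s response is version-dependent and deliberately left out.

```shell
# Keep the first and last LIMIT bytes of stdin, eliding the middle.
# A hypothetical building block for an output-filtering hook; wiring
# it into Claude Code's hook response format is not shown here.
truncate_output() {
  limit=$1
  input=$(cat)
  size=${#input}
  if [ "$size" -le $((limit * 2)) ]; then
    printf '%s' "$input"       # small enough: pass through untouched
  else
    printf '%s\n[... %s bytes elided ...]\n%s' \
      "$(printf '%s' "$input" | head -c "$limit")" \
      "$((size - limit * 2))" \
      "$(printf '%s' "$input" | tail -c "$limit")"
  fi
}
```

The head-and-tail split matters: build logs usually put the command at the top and the error at the bottom, so those are the bytes worth keeping.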

What Stays in Instructions

Not everything can be automated. My /commit command remains a prompt (conventional commits, batching logic, scope detection) because it requires judgment, not verification. The AI needs to decide whether a change is a feat or a fix, whether two files belong in the same commit or not, and no script can do that.

The rule I follow:

  • Repeatable and verifiable → script (hook, linter, pre-commit)
  • Contextual and requires judgment → instruction (command, skill)
  • When in doubt → script

And above all: no 200-line instruction file that the model will skim and that’ll go stale within weeks. The project’s content (the code, the types, the file structure) is already discoverable. The AI knows how to read a package.json, a tsconfig.json, a directory tree. Spelling it out in a rules file just adds noise. If I have a CLAUDE.md, it’s a few lines, as few as possible, and only for repeated mistakes I haven’t managed to extract into a script. The useful exercise is to reread every line of your instruction file and ask: could a script guarantee this? If so, the line doesn’t belong there.

We have a natural bias toward documenting everything for the AI, the way we would for a new developer. But a new developer reads the docs once and remembers. The AI rereads them every session, and every context token spent on our rules is one less token available for reasoning.

When Instructions Become Gates

There’s one case where instructions retain an advantage over scripts: multi-step creative tasks where each phase requires judgment, but the transition between phases can be locked down.

I built a skill to automate React component creation from Figma, leveraging a design system and design tokens. The workflow is broken into five stages, inspired by the Ralph Wiggum method:

  1. Study: analyze the design tokens and Figma file, extract variables, spacing, typography
  2. Architecture: define the component and sub-component breakdown, props, variants
  3. Implementation: generate the code. Erratic by design, this is a first pass, not the final result
  4. Refinement loop: compare the output to the design, list discrepancies, fix, re-compare, until zero differences remain
  5. Validation: tests, accessibility, design system consistency

Each stage has sine qua non conditions for writing COMPLETE and moving to the next. Stage 1 doesn’t end until tokens are extracted. Stage 4 doesn’t end as long as visual discrepancies remain. These are gates, not suggestions.

The stage 3 implementation is deliberately loose, and that’s counterintuitive, but rather than asking the AI to be perfect on the first try (which it won’t be), I accept imperfection and correct it in a dedicated loop. Stage 4 is stage 3’s safety net.

This is a skill, a tool-agnostic instruction file. It works with Claude Code, Cursor, or any agent that supports skills. But to truly make the process reliable, the next step would be to turn each gate into an actual script. A for loop Ralph Wiggum-style, where each iteration relaunches the agent with a fresh context on a single stage, and completion is verified by code, not by the LLM.
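Sketched under assumptions about file layout (one prompt file and one check script per stage, all hypothetical names), that target loop could look like:

```shell
# Hypothetical staged loop: each stage reruns the agent with a fresh
# context until that stage's deterministic check script passes.
# prompts/<stage>.md, checks/<stage>.sh and `claude -p` are assumptions.
run_stages() {
  for stage in "$@"; do
    until "./checks/$stage.sh"; do           # gate verified by code
      cat "prompts/$stage.md" | claude -p    # fresh context per iteration
    done
  done
}
# e.g. run_stages study architecture implementation refine validate
```

Stage 4’s check script would be the visual diff against the design; stage 5’s would run tests and accessibility audits. The agent never decides it is done.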

The skill with its gates is a pragmatic compromise. The loop with deterministic checks is the target. Between the two lies the time you have and the tools you know.

Conclusion

We’re still early, the tools change every month, and what works today might be obsolete tomorrow. Even the people building these tools aren’t sure where they’re heading.

The official documentation for Claude Code recommends running /init to generate a CLAUDE.md, while research shows that LLM-generated files degrade performance. Prompting best practices went in just a few months from “use CRITICAL, NEVER, ALWAYS” to “be positive, Claude 4 is more sensitive.” All while writing on the same page that you can “tune instructions by adding emphasis (e.g., ‘IMPORTANT’ or ‘YOU MUST’) to improve adherence.”

When you’re at the end of the chain, working with probabilistic algorithms feels like an inexact science. Our leverage is the same as always: test, observe, conclude. But the principle won’t change: predictability is built with code, not with words.