Tips 11–15
Optimization: cost, speed, and compounding
These techniques compound. Applied consistently, they reduce cost, improve output quality, and make your configuration progressively more effective over time.
11
Category: Subagents · Impact: High (prevents stream timeouts)
Cap subagent scope to 20 minutes — Anthropic stream timeouts kill long agents
Anthropic's API has a stream-idle timeout that kills long-running agent sessions when Claude is thinking but not producing output. On complex tasks that require deep reasoning, this can terminate a subagent mid-execution — wasting tokens and producing no output.
The practical fix: cap each subagent task at approximately 20 minutes of work. If a task requires more, break it into a chain of smaller agents, each with a clear handoff. The orchestrating agent receives each result and passes it to the next.
Example: instead of "refactor this entire 2,000-line module," use "refactor functions 1–10, output results" then "given the refactored functions 1–10, refactor functions 11–20." Each subagent is scoped, completes reliably, and the chain produces the full result.
Why it matters: A 60-minute subagent that times out at minute 45 produces nothing and costs tokens. Five 12-minute subagents produce the same output reliably with zero timeout risk.
12
Category: Context · Impact: Medium-High
Use @ mentions for precise context loading — not "remember what I told you about X"
Claude Code supports @filename syntax to load specific files into the conversation context. This is significantly more effective than describing a file's contents or asking Claude to "remember" something from a previous session.
Good pattern: @brain/api_spec.md Review this API spec and then implement the /users endpoint. Claude gets the exact spec contents in context rather than a vague reference. This eliminates hallucinated API details and ensures Claude is working from the actual source of truth.
For frequently referenced documents — your API spec, your data models, your style guide — add loading instructions to CLAUDE.md: "When working on API endpoints, read @brain/api_spec.md before starting." This makes context loading automatic for common task types.
Why it matters: "Remember the API spec we discussed" is ambiguous and often produces hallucinated details. "@brain/api_spec.md" loads the exact document. Precision in context loading produces precision in output.
13
Category: Quality · Impact: High (prevents expensive rework)
Spec before build — every time, even for small features
The highest-cost mistake in Claude Code sessions is building the wrong thing. It's not slow — it's fast. Claude Code can build an entire feature in minutes. If that feature doesn't match what you actually need, you've lost minutes of session time and tokens, and you have to start over.
Before any non-trivial build task, ask Claude to write a three-line spec first: what it does, what its inputs and outputs are, and what done looks like. Review the spec. If it's wrong, fix it before code is written. The spec review takes 30 seconds. Fixing code written against the wrong spec takes minutes.
For API endpoints: define the schema before implementation. For UI components: define the props and expected behavior first. For scripts: define inputs, outputs, and edge cases. Three sentences of spec eliminates the most expensive category of rework.
Why it matters: Claude Code is fast enough that wrong implementations get built before you realize they're wrong. A spec is the verification that Claude understood the requirement correctly — done before any code exists to refactor.
14
Category: Cost · Impact: Medium (significant savings on large context)
Optimize cost: keep your CLAUDE.md stable to leverage prompt caching
Anthropic's API implements prompt caching on repeated context. When your CLAUDE.md is stable across sessions, the API can cache it — dramatically reducing the token cost of reading it at the start of every session. On large CLAUDE.md files (200+ lines), this can reduce context costs by 60–90%.
The implication: don't edit your CLAUDE.md constantly. Make changes deliberately, let the file stabilize, and the cache kicks in. Frequent small edits to CLAUDE.md defeat caching and increase costs. Batch changes, stabilize, then leave it alone.
Similarly: memory files you load repeatedly should be stable between sessions. If brain/reference/api_spec.md rarely changes, caching reduces its load cost significantly. Keep dynamic state out of stable reference files.
Why it matters: On heavy Claude Code usage, prompt caching on a stable CLAUDE.md can save $10–50/month in API costs. The savings compound as session volume increases.
15
Category: System Design · Impact: Compounding (gets better over time)
Build self-improving configurations — every bug fix should update the rules
The highest-leverage Claude Code pattern is one most users never implement: when Claude makes a mistake, don't just fix it. Update CLAUDE.md with a rule that prevents the same mistake from happening again.
Example: Claude generates TypeScript with any types. Fix the specific instance, then add to CLAUDE.md: "NEVER use 'any' type. Use 'unknown' with type guards, or define explicit interfaces. No exceptions." The rule prevents every future instance automatically.
Apply this recursively. Every correction becomes a rule. Every repeated question Claude asks you becomes an answer in CLAUDE.md. Every preference you state becomes a permanent constraint. Over weeks and months, your CLAUDE.md becomes a precise specification of your project's DNA — and Claude's output gets progressively better without additional instruction.
Why it matters: Without self-improvement, you correct the same mistakes repeatedly. With self-improvement, you correct each mistake exactly once. The compounding effect over 6 months of active Claude Code use is dramatic — configurations built this way require a fraction of the per-session correction that unmanaged configs require.