If an engineering agent keeps touching the wrong files, the problem is often not the model. It is the context packet.
Too many teams still hand an agent a giant repo dump, a vague task, and hope retrieval or tool use will clean it up. That works right up until the agent edits a similarly named module, misses a hidden invariant, or burns half its token budget re-reading noise.
What actually helps is a bounded packet: a task manifest, a repo map, a short evidence bundle, and a few hard constraints. This is the packet shape I would use for engineering agents that need to produce reviewable edits.
Why this matters
Good agents do not just need more context. They need the right context, in an order that helps them plan before they edit.
In practice, packet quality changes bad-edit rate, reviewability, latency, and how often a human has to rescue the run. That makes it one of the highest-leverage parts of an agent workflow.
Architecture or workflow overview
A context packet should be assembled before the first write-capable tool call, then refreshed at checkpoints when the task uncovers something materially new.
flowchart LR
A[Issue or task] --> B[Task classifier]
B --> C[Repo map and invariant lookup]
C --> D[Relevant file scorer]
D --> E[Context packet builder]
E --> F[Agent plan]
F --> G[Edit or tool execution]
G --> H[Verification and packet refresh]Implementation details
Start with a task manifest, not just prose
I like a small YAML manifest because it forces structure before the agent starts improvising.
kind: engineering-task
objective: Add retry-safe webhook delivery to the background worker
success:
- duplicate webhook sends are prevented
- failed deliveries are retried with backoff
- existing metrics still emit
constraints:
- do not change public API routes
- keep edits inside worker/ and shared/queue/
- add tests for retry dedupe behavior
verify:
- pnpm test worker/retry.test.ts
- pnpm lint
avoid:
- touching billing/
- changing database migrationsThis is boring, which is good. The agent now knows where success lives, where it should not wander, and how the result will be checked.
Build a repo map with file roles, not raw tree dumps
A raw directory listing is cheap to generate but not very useful. A repo map should explain what matters.
{
"areas": [
{
"path": "worker/dispatcher.ts",
"role": "dequeues jobs and handles retry state transitions",
"editLikelihood": "high"
},
{
"path": "shared/queue/idempotency.ts",
"role": "idempotency key generation and lease checks",
"editLikelihood": "medium"
},
{
"path": "billing/",
"role": "separate domain, not relevant to webhook retries",
"editLikelihood": "forbidden"
}
],
"invariants": [
"job attempt count is monotonic",
"delivery metrics use the existing queue labels",
"duplicate sends must fail closed"
]
}I would rather give an agent eight annotated files than sixty unlabelled ones.
Score relevance before bundling source excerpts
The packet builder should rank candidate files using cheap signals first, like import proximity, symbol overlap, test names, and issue keywords.
export function scoreCandidate(file: RepoFile, task: TaskManifest): number {
let score = 0;
if (task.objectiveWords.some(word => file.path.includes(word))) score += 3;
if (file.exports.some(symbol => task.symbolHints.includes(symbol))) score += 4;
if (file.tags.includes('entrypoint')) score += 2;
if (file.tags.includes('forbidden')) score -= 100;
if (file.recentlyChangedWithTests) score += 2;
return score;
}Bundle evidence in layers
| Layer | What goes in it | Why it exists | Failure if missing |
|---|---|---|---|
| task | objective, constraints, verify commands | keeps the agent pointed at the actual job | vague plans and scope drift |
| map | file roles, invariants, ownership hints | gives structure before source code | edits the right concept in the wrong place |
| evidence | short excerpts from top files | provides implementation truth | agent hallucinates unseen details |
| guardrails | no-go paths, approval notes, rollout cautions | limits blast radius | locally reasonable but operationally bad actions |
Refresh the packet after major discoveries
The first packet is never perfect. If the agent learns a key invariant halfway through, capture it and rebuild the packet before the next write step.
def refresh_packet(packet, finding):
if finding.kind == "invariant":
packet["repo_map"]["invariants"].append(finding.text)
if finding.kind == "new_hot_file":
packet["evidence_files"].append(finding.path)
packet["version"] += 1
return packet$ packet build --task tasks/retry-webhook.yaml selected 7 files added 3 invariants excluded 2 forbidden paths packet version: 4 estimated tokens: 6840 verification commands: 2
What went wrong and the tradeoffs
Packet bloat
Teams discover context packets, then immediately turn them into mini wikis. Once the packet becomes a dumping ground, the agent starts skipping important bits because everything looks equally important.
Stale repo maps
A stale map is worse than no map if it confidently points the agent at old architecture. If you keep maps in-repo, regenerate or review them when major modules move.
Overfitting to path names
Simple relevance scoring can be fooled by similar names. That is why invariant notes and file-role annotations matter.
What I would not do: I would not stuff entire files into the packet by default, I would not let the agent pick forbidden paths and justify later, and I would not trust vector retrieval alone to understand repo boundaries.
Practical checklist
- define success criteria before selecting files
- annotate allowed and forbidden paths explicitly
- include 5 to 12 high-signal files, not 50 low-signal ones
- attach verify commands the agent can run immediately
- record invariants in plain language
- rebuild the packet after major discoveries
- log the packet version used for each edit batch
Conclusion
Context packets are one of the highest-leverage boring tools in an engineering-agent stack. They reduce wasted tokens, improve edit quality, and make failures easier to explain. The main trick is staying selective. Smaller, fresher, and more explicit beats bigger almost every time.