
TL;DR // summary
- ETH Zurich ran the first real test. LLM-generated AGENTS.md (the one your agent spits out with /init) drops success rate by ~3% and jacks up token cost by ~20%.
- Human-written ones give a pathetic +4% bump on real issues but still cost you the same extra tokens because agents start over-testing everything.
- If your repo already has a decent README, these files are pure duplication that makes agents dumber and slower.
- Focus on writing real docs. They help both you and the agents. Win-win!
I’m sure many of us start every new project with Claude Code or Cursor by immediately creating a CLAUDE.md or AGENTS.md file. I personally used to think this would dramatically improve my AI agents’ performance and reduce token usage. My first command in every project was always /init. Context files are supposed to guide AI agents and make them more familiar with the project before starting work. Turns out, based on the recent ETH Zurich study, these context files can do more harm than good if your repo already has basic documentation.
What's the point of context files?
Context files (like CLAUDE.md or AGENTS.md) are meant to be practical guides that keep agents on track, stop them from hallucinating tooling, and cut stupid repetitive mistakes.
They’re supposed to cover:
- Project overview, folder structure, and key modules.
- Build/run/test/deploy commands, env vars, and common pitfalls.
- Coding standards and patterns (test coverage, test types, etc.).
- Locked-down tooling so LLMs don’t invent their own libraries. We don't want 10 different HTTP clients!
Most of us were told the very first step is to create one: either hand-write it or let the agent generate it automatically with /init. The idea was fewer wasted tokens on discovery. Turns out it was mostly hype.
The study that changes the game
The ETH Zurich team ran the numbers (Full paper PDF). They built their own benchmark: 138 issues from 12 niche Python repos that actually ship developer-written context files. They also ran SWE-bench Lite. They tested four different agents so we can't blame model bias: Claude Sonnet-4.5, GPT-5.2, GPT-5.1 Mini and Qwen. Repos averaged 3337 files (up to 26k), so a decent size.

Study findings
They split the cases: repos with existing docs vs repos without any docs.
When basic docs already exist (most real repos):
- LLM-generated context files reduce success rate by ~3%.
- Human-written ones give only +4%.
- Both increase token cost by ~20% in every single setting.
- Agents follow the instructions too well, run extra tests, grep everything, and waste reasoning tokens.
When they removed all documentation:
- LLM-generated context files finally help: +2.7% success rate.
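To put the findings above in perspective, here is a rough back-of-the-envelope sketch of what they could mean for tokens spent per solved issue. The baseline success rate and tokens per attempt are made-up numbers, not from the study, and treating the ~3% and +4% changes as percentage points is my assumption.

```python
def tokens_per_solved_issue(success_rate: float, tokens_per_attempt: float) -> float:
    """Expected tokens spent for each successfully resolved issue."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return tokens_per_attempt / success_rate


# Hypothetical baseline, NOT from the study: 50% success, 1M tokens per attempt.
BASELINE_SUCCESS = 0.50
BASELINE_TOKENS = 1_000_000

# Assumption: the reported success changes are percentage points,
# and token cost rises by 20% for both kinds of context file.
scenarios = {
    "no context file": (BASELINE_SUCCESS, BASELINE_TOKENS),
    "LLM-generated": (BASELINE_SUCCESS - 0.03, BASELINE_TOKENS * 1.20),
    "human-written": (BASELINE_SUCCESS + 0.04, BASELINE_TOKENS * 1.20),
}

for name, (rate, tokens) in scenarios.items():
    cost = tokens_per_solved_issue(rate, tokens)
    print(f"{name:16s} {cost:,.0f} tokens per solved issue")
```

With these made-up baselines, the human-written file still comes out roughly 11% more expensive per solved issue, and the LLM-generated one roughly 28% more. Plug in your own numbers before drawing conclusions.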
The paper is blunt:
We therefore suggest omitting LLM-generated context files for the time being, contrary to agent developers' recommendations, and including only minimal requirements.
In normal repos with a decent README, context files are mostly duplication. They repeat what’s already there and add rules that make the agent over-think and slow down.
It's like wanting a quick burger. Normally you walk in, order in 3 minutes. But with a context file the manager grabs you at the door for a full kitchen tour, sourcing rules, quality checks, and exact procedure. You waste 15 minutes and way more energy for the same burger.
What about other opinions?
Some influencers and a smaller study (“On the Impact of AGENTS.md Files on the Efficiency of AI Coding Agents”) claim +28% speed and -16% tokens. That one only looked at wall-clock time on 10 repos with Codex and didn’t even check if the tasks actually got solved correctly. The ETH results are the first ones that actually measure real success rate on real issues.
Data is still thin and opinions are split. Fair enough.
Should I keep context files?
Not enough evidence for 100% certainty, but the pattern is clear.
My take:
- Focus on adding or improving real documentation (READMEs, architecture notes). It helps both humans and agents!
- Skip context files if you already have basic docs. Delete them if they are outdated - don't confuse agents.
- If you must keep one, keep it brutally minimal: only non-obvious stuff like the exact build command or one weird dependency. No overviews, no fluff.
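If you want a quick sanity check before deleting or trimming a context file, a crude sketch like the one below can flag lines that just repeat your README. The function names are hypothetical and exact-line matching is only a rough heuristic of mine, not anything the study used.

```python
def normalize(line: str) -> str:
    """Strip heading/list markers and case so near-identical lines compare equal."""
    return line.strip().lstrip("#-*> ").strip().lower()


def duplicated_lines(context_text: str, readme_text: str) -> list[str]:
    """Return context-file lines whose normalized text also appears in the README."""
    readme = {normalize(line) for line in readme_text.splitlines()}
    readme.discard("")  # ignore blank lines
    return [
        line.strip()
        for line in context_text.splitlines()
        if normalize(line) and normalize(line) in readme
    ]


def duplication_ratio(context_text: str, readme_text: str) -> float:
    """Fraction of non-blank context-file lines that duplicate the README."""
    lines = [line for line in context_text.splitlines() if normalize(line)]
    if not lines:
        return 0.0
    return len(duplicated_lines(context_text, readme_text)) / len(lines)


# Made-up example content, for illustration only.
readme = "# My Project\nRun `make test` to run the tests.\nUses Python 3.12.\n"
context = (
    "# My Project\n"
    "- Run `make test` to run the tests.\n"
    "- Use the internal http_client wrapper, never requests directly.\n"
)

for line in duplicated_lines(context, readme):
    print("already in README:", line)
print(f"duplication ratio: {duplication_ratio(context, readme):.0%}")
```

In a real repo you would pass in the file contents instead, for example `Path("AGENTS.md").read_text()` and `Path("README.md").read_text()` from `pathlib`. Whatever survives the check is a candidate for the "non-obvious stuff" worth keeping.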
Stop following the vendor hype. The study shows most of us have been burning tokens and slowing agents down for nothing. Write better real docs instead. That’s the only context that actually matters.