Does adding context files actually help AI agents?

Do AGENTS.md Files Actually Help AI Coding Agents? Not as much as you expect

By Mike Nova

May 18, 2026

5 min read

TL;DR // summary

ETH Zurich ran the first real test. An LLM-generated AGENTS.md (the one your agent spits out with /init) drops success rate by ~3% and jacks up token cost by ~20%.
Human-written ones give a pathetic +4% bump on real issues but still cost you the same extra tokens because agents start over-testing everything.
If your repo already has a decent README, these files are pure duplication that makes agents dumber and slower.
Focus on writing real docs. They help both you and the agents. Win-win!

I’m sure many of us start every new project with Claude Code or Cursor by immediately creating a CLAUDE.md or AGENTS.md file. I personally used to think this would dramatically improve my AI agent’s performance and reduce token usage. My first command in every project was always /init. Context files are supposed to guide AI agents and make them more familiar with the project before starting work. Turns out, based on the recent ETH Zurich study, these context files can do more harm than good if you already have basic documentation in your repo.

What's the point of context files?

Context files (like CLAUDE.md or AGENTS.md) are meant to be practical guides that keep agents on track, stop them from hallucinating tooling, and cut stupid repetitive mistakes.

They’re supposed to cover:

Project overview, folder structure, and key modules.
Build/run/test/deploy commands, env vars, and common pitfalls.
Coding standards and patterns (test coverage, test types etc.).
Locked-down tooling so LLMs don’t invent their own libraries. We don't want 10 different HTTP clients!

Most of us were told the very first step is to create one. Either hand-write it or let the agent generate it automatically with /init. The idea was fewer wasted tokens on discovery. Turns out it was mostly hype.

The study that changes the game

The ETH Zurich team ran real numbers (full paper PDF). They built their own benchmark: 138 issues from 12 niche Python repos that actually ship developer-written context files. They also ran SWE-bench Lite. Everything was tested with four different models, so we can't blame model bias: Claude Sonnet-4.5, GPT-5.2, GPT-5.1 Mini and Qwen. Repos averaged 3337 files (up to 26k), so they are a decent size.

Evaluation pipelines overview. Figure 3 from Gloaguen et al. ([arXiv:2602.11988](https://arxiv.org/pdf/2602.11988), CC BY 4.0)

Study findings

They split the cases: repos with existing docs vs repos without any docs.

When basic docs already exist (most real repos):

LLM-generated context files reduce success rate by ~3%.
Human-written ones give only +4%.
Both increase token cost by ~20% in every single setting.
Agents follow the instructions too well, run extra tests, grep everything, and waste reasoning tokens.
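
To see why a small success bump doesn't pay for a 20% token hike, here is a rough back-of-the-envelope sketch. The baseline (50% success, 100k tokens per attempt) is made up for illustration and is not from the paper. I'm also reading the ~3% and +4% as percentage points, which may not match the paper's exact definition.

```python
def tokens_per_solved_task(success_rate: float, tokens_per_attempt: float) -> float:
    """Expected tokens spent for each successfully resolved task."""
    return tokens_per_attempt / success_rate


# Hypothetical baseline, NOT taken from the study.
BASE_SUCCESS = 0.50
BASE_TOKENS = 100_000

# Deltas from the article: -3 / +4 points of success, +20% tokens in both cases.
scenarios = {
    "no context file": (BASE_SUCCESS, BASE_TOKENS),
    "LLM-generated": (BASE_SUCCESS - 0.03, BASE_TOKENS * 1.20),
    "human-written": (BASE_SUCCESS + 0.04, BASE_TOKENS * 1.20),
}

results = {
    name: tokens_per_solved_task(rate, tokens)
    for name, (rate, tokens) in scenarios.items()
}

for name, cost in results.items():
    print(f"{name:16s} {cost:>10,.0f} tokens per solved task")
```

With this made-up baseline, even the human-written file ends up costing more tokens per solved task than having no context file at all. Plug in your own numbers; the conclusion depends on how low your baseline success rate is.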

When they removed all documentation:

LLM-generated context files finally help: +2.7% success rate.

The paper is blunt:

"We therefore suggest omitting LLM-generated context files for the time being, contrary to agent developers' recommendations, and including only minimal requirements."

In normal repos with a decent README, context files are mostly duplication. They repeat what’s already there and add rules that make the agent over-think and slow down.

It's like wanting a quick burger. Normally you walk in, order in 3 minutes. But with a context file the manager grabs you at the door for a full kitchen tour, sourcing rules, quality checks, and exact procedure. You waste 15 minutes and way more energy for the same burger.

What about other opinions?

Some influencers and a smaller study (“On the Impact of AGENTS.md Files on the Efficiency of AI Coding Agents”) claim +28% speed and -16% tokens. That one only looked at wall-clock time on 10 repos with Codex and didn’t even check if the tasks actually got solved correctly. The ETH results are the first ones that actually measure real success rate on real issues.

Data is still thin and opinions are split. Fair enough.

Should I keep context files?

Not enough evidence for 100% certainty, but the pattern is clear.

My take:

Focus on adding or improving real documentation (READMEs, architecture notes). It helps both humans and agents!
Skip context files if you already have basic docs. Delete them if they are outdated - don't confuse agents.
If you must keep one, keep it brutally minimal: only non-obvious stuff like the exact build command or one weird dependency. No overviews, no fluff.
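
If you want a quick sanity check for duplication before deleting anything, something like the sketch below could flag lines in a context file that already appear in the README. This is my own crude heuristic (exact match after light normalization), not a method from the study. The function names and the sample README and AGENTS.md contents are hypothetical.

```python
import re


def normalize(line: str) -> str:
    """Lowercase, drop leading list/heading markers, collapse whitespace."""
    line = re.sub(r"^\s*(?:[#>*+-]+|\d+\.)\s*", "", line.lower())
    return re.sub(r"\s+", " ", line).strip()


def duplicated_lines(context_text: str, readme_text: str, min_len: int = 15) -> list[str]:
    """Return context-file lines whose normalized form also appears in the README."""
    readme = {normalize(line) for line in readme_text.splitlines()}
    readme.discard("")
    duplicates = []
    for raw in context_text.splitlines():
        norm = normalize(raw)
        # Ignore very short lines (headings, blanks) to avoid noisy matches.
        if len(norm) >= min_len and norm in readme:
            duplicates.append(raw.strip())
    return duplicates


# Hypothetical sample contents, for illustration only.
readme = """# Demo project
Run tests with `pytest -q`.
Install dependencies with `pip install -e .`.
"""

agents = """# AGENTS.md
- Run tests with `pytest -q`.
- Use the vendored `httpx` client only.
"""

dups = duplicated_lines(agents, readme)
print(dups)
```

In a real repo you would read both files from disk instead of using inline strings. Anything this flags is a candidate for deletion from the context file; whatever is left is the non-obvious stuff worth keeping. It won't catch paraphrased duplication, so treat it as a first pass only.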

Stop following the vendor hype. The study shows most of us have been burning tokens and slowing agents down for nothing. Write better real docs instead. That’s the only context that actually matters.

ABOUT AUTHOR

[ NAME ].....................Mike Nova

[ BACKGROUND ].......Software Engineer @ Top-Tier Banks

[ BIO ]..........................10 years at top banks building massive financial systems that generate billions in revenue. This blog documents the reality in tech and top banks without the PR bs. I focus on tech that gets the job done and write articles that expose corporate theater.

[ CONNECT ]................

Substack

Twitter (X)
