Selected developer quotes:
“I’m torn. I’d like to help provide updated data on this question but also I really like using AI!” — a developer from the original early-2025 study, when asked to participate in the late-2025 study.
“I found I am actually heavily biased sampling the issues … I avoid issues like AI can finish things in just 2 hours, but I have to spend 20 hours. I will feel so painful if the task is decided as AI-disallowed.” — a developer from the new study noting selection effects when choosing what tasks to include in the study.
“my head’s going to explode if I try to do too much the old fashioned way because it’s like trying to get across the city walking when all of a sudden I was more used to taking an Uber.” — a developer from the new study describing how quickly AI-assisted workflows became their baseline.



The gains, where they exist, are nowhere near that much. In some cases, it makes developers slower (even though they think they’re a bit faster):
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
Have you actually read the study? People keep citing this study without reading it.
They grabbed like 8 devs who did not have pre-existing workflows set up for optimizing AI usage, and just threw them into it as a measure of “does it help”
Imagine if I grabbed 8 devs who had never used neovim before and threw them into it without any plugins installed or configuration and tried to use that as a metric for “is nvim good for productivity”
People need to stop quoting this fuckass study lol, it’s basically meaningless.
I’m a developer using agentic workflows, with over 17 years of experience.
I am telling you right now, with the right setup, I turn 20-hour jobs into 20-minute jobs on a weekly basis.
Predominantly large “bulk” operations that are mostly just necessary boilerplate code, where the AI has a huge existing codebase to draw from as samples and I just give it instructions like “see what already exists? implement more of that, following <spec>”
A great example is integration testing where like 99% of the code is just boilerplate.
Arrange the same setup every time. Arrange your request following an openapi spec file. Send the request. Assert on the response based on the openapi spec.
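The arrange/send/assert pattern described above can be sketched as a loop over a spec. This is a minimal toy sketch, not real OpenAPI tooling: the `SPEC` dict, `fake_client`, and all field names are illustrative assumptions standing in for a parsed spec file and a real HTTP client.

```python
# Toy sketch of the boilerplate pattern: one generated test per endpoint
# in a hypothetical, heavily simplified OpenAPI-style spec dict.
SPEC = {
    "/users":  {"method": "GET", "status": 200, "required_fields": ["id", "name"]},
    "/orders": {"method": "GET", "status": 200, "required_fields": ["id", "total"]},
}

def fake_client(path, method):
    # Stand-in for a real HTTP client; returns canned (status, body) pairs.
    canned = {
        "/users":  (200, {"id": 1, "name": "alice"}),
        "/orders": (200, {"id": 7, "total": 42.0}),
    }
    return canned[path]

def run_spec_tests(spec, client):
    """Arrange, send, assert -- the exact same shape for every endpoint."""
    results = {}
    for path, expected in spec.items():
        status, body = client(path, expected["method"])        # send
        ok = status == expected["status"] and all(             # assert
            field in body for field in expected["required_fields"]
        )
        results[path] = ok
    return results

print(run_spec_tests(SPEC, fake_client))
# {'/users': True, '/orders': True}
```

Because every test has the same shape, generating 120 of them from a spec is mostly mechanical, which is exactly why this kind of task suits an agent.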
I had an agent pump out 120 integration tests based on a spec file yesterday, and they were, for the most part, 100% correct. In like an hour.
The same volume of work would’ve easily taken me way longer.
What about developer burnout rates? Because those same studies also say there was significantly less dev burnout happening.
If anything my personal experience is the opposite. When using AI the way work wants me to, with multiple agents going in the background, I’ve completely lost any sort of “flow state” I normally get when focused on a problem. It’s no fun anymore, and the only thing keeping me going is working on my personal projects without AI in my free time… I didn’t get in to this to become an AI babysitter.
Yeah I get that. I just like avoiding having to do boring tasks is all, so that I can work on the core problem I’m trying to solve. I don’t want to deal with code refactoring manually, I’d rather babysit this thing to do that piece by piece. It’d probably take me longer, because I’d do something else on the side that I actually wanted to work on, but I’d be more content not having to manually do the tedious refactoring myself.
AI is not a catch-all for every problem; if I’m thinking something through, I use a very different set of tools for that. I might use an LLM, but mainly as an interface over a vector DB to help me look things up, and not to write or show me any code ever. Essentially a contextual grep or rg.
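The “contextual grep” idea above boils down to ranking snippets by similarity to a query instead of by literal match. A toy sketch, assuming bag-of-words vectors as a stand-in for real learned embeddings and a real vector store:

```python
# Toy "contextual grep": rank code snippets by cosine similarity to a
# query. Real setups would use learned embeddings + a vector DB; the
# bag-of-words embedding here is purely an illustrative stand-in.
from collections import Counter
import math

def embed(text):
    # Toy embedding: lowercase bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

SNIPPETS = [
    "def parse_config(path): open yaml config file",
    "def send_request(url): http client post request",
    "def assert_response(resp): check status and body fields",
]

def search(query, snippets, top_k=1):
    q = embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:top_k]

print(search("http request client", SNIPPETS))
# ['def send_request(url): http client post request']
```

Unlike plain `grep`/`rg`, this surfaces the most relevant snippet even when the query words are only partially present, which is the lookup role being described for the LLM here.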
Sorry you’re being forced to use a hammer to make a surgical precision cut. That really sucks man.