METR AI Coding Research Inconclusive Because Dev Participants Refused to Complete Tasks Without AI (metr.org)

submitted 21 hours ago* (last edited 21 hours ago) by brianpeiris@lemmy.ca to c/programming@programming.dev

14 comments

Selected developer quotes:

“I’m torn. I’d like to help provide updated data on this question but also I really like using AI!” — a developer from the original study early-2025 when asked to participate in the late-2025 study.

“I found I am actually heavily biased sampling the issues … I avoid issues like AI can finish things in just 2 hours, but I have to spend 20 hours. I will feel so painful if the task is decided as AI-disallowed.” — a developer from the new study noting selection effects when choosing what tasks to include in the study.

“my head’s going to explode if I try to do too much the old fashioned way because it’s like trying to get across the city walking when all of a sudden I was more used to taking an Uber.” — a developer from the new study noting selection effects when choosing what tasks to include in the study.

[–] rimu@piefed.social 18 points 18 hours ago (1 children)

20 hours of work turns into 20 minutes

The gains, where they exist, are nowhere near that large. In some cases, AI tools make developers slower (even though they think they're a bit faster):

we find that when developers use AI tools, they take 19% longer than without - AI makes them slower.

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

[–] pixxelkick@lemmy.world 1 points 11 hours ago

Have you actually read the study? People keep citing this study without reading it.

To directly measure the real-world impact of AI tools on software development, we recruited 16 experienced developers from large open-source repositories (averaging 22k+ stars and 1M+ lines of code) that they’ve contributed to for multiple years. Developers provide lists of real issues (246 total) that would be valuable to the repository—bug fixes, features, and refactors that would normally be part of their regular work.

They grabbed 16 devs who did not have pre-existing workflows set up for optimizing AI usage, and just threw them into it as a measure of "does it help".

Imagine if I grabbed 8 devs who had never used neovim before and threw them into it without any plugins installed or configuration and tried to use that as a metric for "is nvim good for productivity"

People need to stop quoting this fuckass study lol, it's basically meaningless.

I'm a developer with over 17 years of experience, using agentic workflows.

I am telling you right now, with the right setup, every week I turn 20-hour jobs into 20-minute jobs.

Predominantly large "bulk" operations that are mostly just boilerplate code that is necessary, where the AI has an existing huge codebase to draw from as samples and I just give it instructions of "see what already exists? implement more of that following "

A great example is integration testing where like 99% of the code is just boilerplate.

Arrange the same setup every time. Arrange your request following an openapi spec file. Send the request. Assert on the response based on the openapi spec.
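
A minimal sketch of that arrange / send / assert loop, in Python, to show why this kind of test is mostly boilerplate. Everything named here is an assumption for illustration: `SPEC` is a hypothetical, heavily trimmed stand-in for an OpenAPI file (it does not follow the real OpenAPI layout), and `fake_service` replaces the real HTTP call so the example runs on its own.

```python
# Hypothetical, trimmed spec. A real OpenAPI document is richer and would be
# loaded from a YAML or JSON file; this shape is an assumption for the sketch.
SPEC = {
    "paths": {
        "/users": {
            "post": {
                "requestBody": {"example": {"name": "Ada"}},
                "responses": {"201": {"required": ["id", "name"]}},
            },
        },
        "/health": {
            "get": {
                "responses": {"200": {"required": ["status"]}},
            },
        },
    }
}


def fake_service(method, path, body=None):
    """Stand-in for the system under test (assumption: no real HTTP here)."""
    if (method, path) == ("post", "/users"):
        return 201, {"id": 1, "name": body["name"]}
    if (method, path) == ("get", "/health"):
        return 200, {"status": "ok"}
    return 404, {}


def build_cases(spec):
    """Arrange: derive one test case per documented response in the spec."""
    cases = []
    for path, methods in spec["paths"].items():
        for method, op in methods.items():
            body = op.get("requestBody", {}).get("example")
            for status, response in op["responses"].items():
                cases.append((method, path, body, int(status), response["required"]))
    return cases


def run_case(send, case):
    """Send the request, then assert on the response using the spec."""
    method, path, body, expected_status, required_fields = case
    status, payload = send(method, path, body)
    assert status == expected_status, f"{method} {path}: got {status}"
    missing = [field for field in required_fields if field not in payload]
    assert not missing, f"{method} {path}: missing {missing}"


def run_all(send, spec):
    """Run every spec-derived case and return how many were checked."""
    cases = build_cases(spec)
    for case in cases:
        run_case(send, case)
    return len(cases)


if __name__ == "__main__":
    print(f"{run_all(fake_service, SPEC)} spec-derived checks passed")
```

Each new endpoint only adds an entry to the spec; the arrange, send, and assert code stays the same, which is the repetitive part being described.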

I had an agent pump out 120 integration tests based on a spec file yesterday, and they were, for the most part, 100% correct. In like an hour.

The same volume of work would've easily taken me way longer.