this post was submitted on 11 Jul 2025
380 points (100.0% liked)

TechTakes


Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

founded 2 years ago
[–] SpaceNoodle@lemmy.world 89 points 3 months ago (5 children)

Devs are famously bad at estimating how long a software project will take.

No, highly complex creative work is inherently extremely difficult to estimate.

Anyway, not shocked at all by the results. This is a great start that begs for larger and more rigorous studies.

[–] Feyd@programming.dev 26 points 3 months ago

You're absolutely correct that the angle of that statement is bullshit. There's also the fact that they want to believe making software is not highly complex creative work but somehow just working an assembly line, and that software devs are gatekeepers who don't deserve respect.

[–] swlabr@awful.systems 48 points 3 months ago (1 children)

Megacorp LLM death spiral:

  1. Megacorp managers at all levels introduce new LLM usage policies.
  2. Productivity goes down (see study linked in post)
  3. Managers make the excuse that this is due to a transitional period in LLM policies.
  4. Policies become mandates. Beatings begin and/or intensify.
  5. Repeat from 1.
[–] wizardbeard@lemmy.dbzer0.com 16 points 3 months ago (2 children)

I've been through the hellscape where managers used missed metrics as evidence for why we didn't need increased headcount on an internal IT helpdesk.

That sort of fuckery is common when management gets the idea in their head that they can save money on people somehow without sacrificing output/quality.

I'm pretty certain they were trying to find an excuse to outsource us, as this was long before the LLM bubble we're in now.

[–] swlabr@awful.systems 16 points 3 months ago

oh, absolutely. I mean you could sub out "LLM" for any bullshit that management can easily spring on their underlings. Agile, standups, return to office, the list goes on. Management can get fucked

[–] froztbyte@awful.systems 12 points 3 months ago

I wish I could make more people both know about, and understand, Goodhart’s law

[–] TommySoda@lemmy.world 47 points 3 months ago

As someone who has had to double-check people's code before, especially people who don't comment appropriately, I'd rather just write it all again myself than try to decipher what the fuck they were even doing.

[–] dgerard@awful.systems 25 points 3 months ago (3 children)

ahahaha holy shit. I knew METR smelled a bit like AI doomsday cultists and took money from OpenPhil, but those "open source" projects and engineers? One of them was LessWrong.

Here's a LW site dev whining about the study; he was in it, and I think he thinks it was unfair to AI

I think if people are citing it in another 3 months' time, they'll be making a mistake

dude $NEXT_VERSION will be so cool

so anyway, this study has gone mainstream! It was on CNBC! I urge you not to watch that unless you have a yearning need to know what the normies are hearing about this shit. In summary, they are hearing that AI coding isn't all that actually and may not do what the captains of industry want.

around 2:30 the two talking heads ran out of information and just started incorrecting each other on the fabulous AI future, like the worst work lunchroom debate ever but it's about AI becoming superhuman

the key takeaway for the non-techie businessmen and investors who take CNBC seriously: the bubble is starting to not go so great

[–] BigMuffN69@awful.systems 10 points 3 months ago* (last edited 3 months ago)

Yeah, METR was the group that made the infamous AI IS DOUBLING EVERY 4-7 MONTHS GRAPH where the measurement was 50% success at SWE tasks based on the time it took a human to complete it. Extremely arbitrary success rate, very suspicious imo. They are fanatics trying to pinpoint when the robo god recursive self improvement loop starts.
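For context, that "50% success" horizon metric can be sketched roughly like this. Everything below is made up for illustration (the task data, the plain gradient-descent fit); it is not METR's actual data or code, just the shape of the measurement: fit a curve for P(AI succeeds) against how long the task takes a human, then read off the task length where that probability crosses 50%.

```python
import numpy as np

# Hypothetical data: how long each task took a human (minutes),
# and whether the AI agent succeeded at it.
human_minutes = np.array([2, 5, 10, 30, 60, 120, 240, 480], dtype=float)
ai_succeeded = np.array([1, 1, 1, 1, 0, 1, 0, 0], dtype=float)

# Fit p(success) = sigmoid(b0 + b1 * log(t)) by plain gradient descent
# on the logistic loss (hand-rolled to avoid an sklearn dependency).
x = np.log(human_minutes)
b0, b1 = 0.0, 0.0
lr = 0.1
for _ in range(20000):
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
    b0 -= lr * np.mean(p - ai_succeeded)
    b1 -= lr * np.mean((p - ai_succeeded) * x)

# The "50% horizon" is the task length where p = 0.5,
# i.e. where b0 + b1 * log(t) = 0.
horizon_minutes = np.exp(-b0 / b1)
print(f"50% success horizon: {horizon_minutes:.0f} minutes")
```

The arbitrariness complained about above lives in that 0.5 threshold: pick 80% success instead and the "horizon" (and any doubling trend drawn through it) comes out very different.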

[–] diz@awful.systems 6 points 3 months ago

I think if people are citing it in another 3 months' time, they'll be making a mistake

In 3 months they'll think they're 40% faster while being 38% slower. And sometime in 2026 they will be exactly 100% slower - the moment referred to as "technological singularity".

[–] scruiser@awful.systems 6 points 3 months ago

Here’s a LW site dev whining about the study, he was in it and i think he thinks it was unfair to AI

There's a complete lack of introspection. You'd think the obvious conclusion to draw from a study showing that people's subjective estimates of their productivity with LLMs were the exact opposite of right would be to question his subjectively felt intuitions and experience, but instead he doubles down and insists the study must be wrong, and that surely with the latest model, used optimally, it would be a big improvement.

[–] ArchmageAzor@lemmy.world 22 points 3 months ago

5% "coding"

95% cleanup

[–] NigelFrobisher@aussie.zone 21 points 3 months ago (2 children)

I have an LLM usage mandate in my performance review now. I can’t trust it to do anything important, so I’ll get it to do incredibly noddy things like deleting a clause (that I literally always have highlighted) or generate documentation that’s more long-winded than just reading the code and then go to the bathroom while it happens.

[–] Threeme2189@sh.itjust.works 13 points 3 months ago (1 children)
[–] dgerard@awful.systems 18 points 3 months ago (1 children)

this sort of bloody stupid metric is widespread, I've heard about it from all over

[–] froztbyte@awful.systems 7 points 3 months ago

goodhart's law's zombie era

[–] purplemonkeymad@programming.dev 9 points 3 months ago

Gotta justify all that money that they have just spent without any trials, testing or end user input.

[–] RememberTheApollo_@lemmy.world 20 points 3 months ago

Anyone who has had to unfuck someone else’s work knows it would have been faster to do the work correctly from scratch the first time.

[–] Tar_alcaran@sh.itjust.works 19 points 3 months ago (4 children)

I just want to point out that every single heavily downvoted, idiotic pro-AI reply on this post is from a .ml user (with one programming.dev thrown in).

I wonder which way the causation flows.

[–] HugeNerd@lemmy.ca 18 points 3 months ago (1 children)

Software and computers are a joke at this point.

Computers no longer solve real problems and are now just used to solve the problems that overly complex software running on monstrous cheap hardware create.

"Hey I'd like to run a simple electronics schematic program like we had in the DOS days, it ran in 640K and responded instantly!"

"OK sure first you'll need the latest Windows 11 with 64G of RAM and 2TB of storage, running on at least 24 cores, then you need to install a container for the Docker for the VM for the flatpak for the library for the framework because the programmer liked the blue icon, then make sure you are always connected to the internet for updates or it won't run, and somehow the program will still just look like a 16 bit VB app from 1995."

"Well that sounds complicated, where's the support webpage for installing the program in Windows 7?"

"Do you have the latest AI agents installed in your web browser?"

"It's asking me to click OK but I didn't install the 1GB mouse driver that sends my porn browsing habits to Amazon..."

"Just click OK on all the EULAs so you lose the right to the work you'll create with this software, then install a few more dependencies, languages, entire VMs written in byte code compiled to HTML to run on JAVA, then make sure you have a PON from your ISP otherwise how can you expect to have a few kilobytes of data be processed on your computer? This is all in the cloud, baby!"

[–] cstross@wandering.shop 13 points 3 months ago (3 children)

@dgerard What fascinates me is *why* coders who use LLMs think they're more productive. Is the complexity of their prompt interaction misleading them as to how effective the outputs it results in are? Or something else?

[–] bigfondue@lemmy.world 13 points 3 months ago (1 children)

Here's a random guess. They are thinking less, so time seems to go by quicker. Think about how long 2 hours of calculus homework seems vs 2 hours sitting on the beach.

[–] V0ldek@awful.systems 8 points 3 months ago (1 children)

This is such a wild example to me, because sitting at the beach is extremely boring and takes forever, whereas doing calculus is at least engaging, so time flies reasonably quickly.

Like when I think what takes the longest in my life I don't think "those times when I'm actively solving problems", I think "those times I sit in a waiting room at the doctors with nothing to do" or "commuting, ditto".

[–] HedyL@awful.systems 10 points 3 months ago* (last edited 3 months ago) (1 children)

What fascinates me is why coders who use LLMs think they’re more productive.

As @dgerard@awful.systems wrote, LLM usage has been compared to gambling addiction: https://pivot-to-ai.com/2025/06/05/generative-ai-runs-on-gambling-addiction-just-one-more-prompt-bro/

I wonder to what extent this might explain this phenomenon. Many gambling addicts aren't fully aware of their losses, either, I guess.

[–] hedgehog@ttrpg.network 11 points 3 months ago

From the blog post referenced:

We do not provide evidence that:

AI systems do not currently speed up many or most software developers

Seems the article should be titled “16 AI coders think they’re 20% faster — but they’re actually 19% slower” - though I guess making us think it was intended to be a statistically relevant finding was the point.

That all said, this was genuinely interesting and is in-line with my understanding of the human psychology that’s at play. It would be nice to see this at a wider scale, broken down across different methodologies / toolsets and models.
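The perception gap in the headline is easy to work through. The baseline time below is a made-up figure purely to show the arithmetic, and "20% faster" is read here as "tasks take 1/1.2 of the time" (one plausible reading):

```python
baseline_hours = 10.0  # hypothetical time for a task without AI assistance

# The study's result: with AI, tasks actually took ~19% longer.
actual_with_ai = baseline_hours * 1.19

# Meanwhile developers believed they were getting a ~20% speedup,
# i.e. that tasks were taking baseline / 1.20 hours.
perceived_with_ai = baseline_hours / 1.20

# Gap between belief and reality on this hypothetical task:
gap = actual_with_ai - perceived_with_ai
print(f"believed: {perceived_with_ai:.2f}h  actual: {actual_with_ai:.2f}h  gap: {gap:.2f}h")
```

On a ten-hour task, that's roughly three and a half hours of work the developers didn't perceive themselves doing, which is why self-reported productivity numbers are such weak evidence either way.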

[–] shnizmuffin@lemmy.inbutts.lol 10 points 3 months ago (1 children)

@dgerard@awful.systems who is your illustrator? These are consistently great.

[–] dgerard@awful.systems 17 points 3 months ago (1 children)

these are stock images! Which are surprisingly cheap. By Valeriy Kachaev, who puts stuff up as Studiostoks on a pile of stock image sites. His pics are bizarre and keep being the perfect thing.

[–] HedyL@awful.systems 6 points 3 months ago (1 children)

I'm not sure how much this observation can be generalized, but I've also wondered how much the people who overestimate the usefulness of AI image generators underestimate the chances of licensing decent artwork from real creatives with just a few clicks and at low cost. For example, if I'm looking for an illustration for a PowerPoint presentation, I'll usually find something suitable fairly quickly in Canva's library. That's why I don't understand why so many people believe they absolutely need AI-generated slop for this. Of course, however, Canva is participating in the AI hype now as well. I guess they have to keep their investors happy.

[–] dgerard@awful.systems 5 points 3 months ago

all the stock sites are. use case: an image that's almost perfect but you wanna tweak it

LEARN PAINT YOU GHOULS

[–] ColdWater@lemmy.ca 9 points 3 months ago

And generate shit code

[–] OmegaLemmy@discuss.online 7 points 3 months ago

For every bit of time saved, there's that one kink that will slow you down by a fuck ton, something that AI just can't get right, something that takes AI 5 hours to fix but would've taken you 10-20 to write from scratch
