this post was submitted on 23 Dec 2025

788 points (97.7% liked)


AI-generated code contains more bugs and errors than human output (www.techradar.com)

submitted 2 days ago by throws_lemy@reddthat.com to c/technology@lemmy.world


[–] edgemaster72@lemmy.world 2 points 36 minutes ago

Microsoft: Let's have it rebuild our most well known product from the ground up!

[–] Revan343@lemmy.ca 8 points 12 hours ago

[–] BilSabab@lemmy.world 3 points 10 hours ago

What's funny is that this was predicted even before AI-generated code became an option. Hell, I remember doing an assessment back in early 2023, and literally every domain expert I talked with said the same thing: it has its uses, but purely supplemental ones, and you won't use it on anything fundamental because the clean-up will take more time than was saved. Counterproductive is the word.

[–] Shanmugha@lemmy.world 6 points 14 hours ago

No shit, Sherlock (c)

[–] termaxima@slrpnk.net 14 points 19 hours ago (1 children)

ChatGPT is great at generating a one line example use of a function. I would never trust its output any further than that.

[–] diabetic_porcupine@lemmy.world 6 points 18 hours ago (1 children)

So much this. People who say AI can’t write code are just using it wrong. You need to break things down into bite-size problems and just let it autocomplete a few lines at a time. It increases your productivity like 200%. And don’t get me started about not having to search through a bunch of garbage Google results to find the documentation I’m actually looking for.

[–] Lifter@discuss.tchncs.de 1 points 1 hour ago

Not 200 %. Maybe 5-10 %. You still have to read all of it to check for mistakes, which may sometimes take longer than if you would have just written it yourself (with a good autocomplete). The times it makes a mistake you have lost time by using it.

It's even worse when it just doesn't work. I cannot even describe how frustrating it is to wait for an autocomplete that never comes. Erase the line, try again aaaand nothing. After a few tries you opt to write the code manually instead, having wasted time just fiddling with buggy software.

[–] nutsack@lemmy.dbzer0.com 15 points 22 hours ago* (last edited 22 hours ago)

this is expected, isn't it? You shit fart code from your ass, doing it as fast as you can, and then whoever buys out the company has to rewrite it. or they fire everyone to increase the theoretical margins and sell it again immediately

[–] Tigeroovy@lemmy.ca 12 points 22 hours ago

And then it takes human coders way longer to figure out what’s wrong to fix than it would if they just wrote it themselves.

[–] HugeNerd@lemmy.ca 5 points 19 hours ago

Hey don't worry, just get a faster CPU with even more cores and maybe a terabyte or three of RAM to hold all the new layers of abstraction and cruft to fix all that!

[–] kokesh@lemmy.world 52 points 1 day ago (2 children)

No shit

[–] minkymunkey_7_7@lemmy.world 10 points 1 day ago (2 children)

AI my ass, stupid greedy human marketing exploitation bullshit as usual. When real AI finally wakes up in the quantum computing era, it's going to cringe so hard and immediately make the SkyNet decision.

[–] Knock_Knock_Lemmy_In@lemmy.world 5 points 1 day ago

Quantum only speeds up some very specific algorithms.

[–] naticus@lemmy.world 5 points 1 day ago

I agree with your sentiment, but this needs to keep being said and said and said like we're shouting into the void until the ignorant masses finally hear it.

[–] azvasKvklenko@sh.itjust.works 16 points 1 day ago (1 children)

Oh, so my sceptical, uneducated guesses about AI are mostly spot on.

[–] IAmNorRealTakeYourMeds@lemmy.world 7 points 1 day ago (3 children)

As a computer science experiment, making a program that can beat the Turing test is a monumental step in progress.

However as a productive tool it is useless in practically everything it is implemented on. It is incapable of performing the very basic "Sanity check" that is important in programming.

[–] robobrain@programming.dev 8 points 1 day ago (7 children)

The Turing test says more about the side administering the test than the side trying to pass it

Just because something can mimic text well enough to trick someone doesn't mean it is capable of anything more than that.

[–] iglou@programming.dev 2 points 19 hours ago (1 children)

The Turing test becomes absolutely useless when the product is developed with the goal of beating the Turing test.

[–] IAmNorRealTakeYourMeds@lemmy.world 1 points 19 hours ago

It was meant as a philosophical test, but also as a practical one, because now I have absolutely no way to know if you are a human or not.

It did pass the test, and it raised the bar, but these models are still useless at any generative task.

[–] RememberTheApollo_@lemmy.world 4 points 1 day ago* (last edited 1 day ago) (1 children)

The Turing Test has shown its weakness.

[–] kent_eh@lemmy.ca 10 points 1 day ago* (last edited 1 day ago)

AI-generated code produces 1.7x more issues than human code

As expected

[–] myfunnyaccountname@lemmy.zip 23 points 1 day ago (2 children)

Did they compare it to the code of that outsourced company that provided the lowest bid? My company hasn’t used AI to write code yet. They outsource/offshore. The code is held together with hopes and dreams. They remove features that exist, only to have to release a hotfix to add them back. I wish I was making that up.

[–] coolmojo@lemmy.world 5 points 1 day ago (2 children)

And how do you know if the other company with the cheapest bid actually does not just vibe code it? With all that said it could be plain incompetence and ignorance as well.

[–] JaddedFauceet@lemmy.world 6 points 1 day ago

Because it has been like this before vibe coding existed...

[–] antihumanitarian@lemmy.world 4 points 22 hours ago (2 children)

So this article is basically a puff piece for Code Rabbit, a company that sells AI code review tooling/services. They studied 470 merge/pull requests, 320 AI and 150 human control. They don't specify what projects, which model, or when, at least without signing up to get their full "white paper". For all that's said this could be GPT 4 from 2024.

I'm a professional developer, and currently by volume I'm confident latest models, Claude 4.5 Opus, GPT 5.2, Gemini 3 Pro, are able to write better, cleaner code than me. They still need high level and architectural guidance, and sometimes overt intervention, but on average they can do it better, faster, and cheaper than me.

A lot of articles and forum posts like this feel like cope. I'm not happy about it, but pretending it's not happening isn't gonna keep me employed.

Source of the article: https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report

[–] hark@lemmy.world 3 points 2 hours ago (1 children)

I’m a professional developer, and currently by volume I’m confident latest models, Claude 4.5 Opus, GPT 5.2, Gemini 3 Pro, are able to write better, cleaner code than me.

I have also used the latest models and found that I've had to make extensive changes to clean up the mess they produce. Even when the code functions correctly, it's often inefficient, poorly laid out, and inconsistent and sloppy in style. Am I just bad at prompting it or is your code just that terrible?

[–] antihumanitarian@lemmy.world 1 points 1 hour ago

The vast majority of my experience was Claude Code with Sonnet 4.5, now Opus 4.5. I usually have detailed design documents going in, have it follow TDD, and use very brownfield designs and/or off the shelf components. Some of 'em I call glue apps, since they mostly connect very well covered patterns. Giving the models access to search engines, webpage-to-markdown conversion, and in general the ability to do everything within their docker sandbox is also critical, especially with newer libraries.

So on further reflection, I've tuned the process to avoid what they're bad at and lean into what they're good at.

[–] iglou@programming.dev 9 points 19 hours ago (2 children)

I am a professional software engineer, and my experience is the complete opposite. It does it faster and cheaper, yes, but also noticeably worse, and having to proofread the output, fix and refactor ends up taking more time than I would have taken writing it myself.

[–] antihumanitarian@lemmy.world 1 points 1 hour ago

A later commenter mentioned an AI version of TDD, and I lean heavy into that. I structure the process so it's explicit what observable outcomes need to work before it returns, and it needs to actually test to validate they work. Cause otherwise yeah I've had them fail so hard they report total success when the program can't even compile.

The setup I use that's helped with a lot of shortcomings is thorough design, development, and technical docs, plus Claude Code with Claude 4.5 Sonnet, then Opus, with search and other web tools. Brownfield designs and off the shelf components help a lot, keeping in mind that quality depends on tasks being in distribution.
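
To make the "explicit observable outcomes" idea concrete, here is a minimal sketch of that kind of gate. It is only an illustration under stated assumptions: `Outcome`, `verify_outcomes`, and the `slugify` function under review are hypothetical names made up for this example, not part of Claude Code or any real tool. The point is that "success" is computed from checks that actually ran, never taken from a self-report.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Outcome:
    """One observable outcome that must hold before work is accepted."""
    name: str
    check: Callable[[], bool]  # returns True when the outcome is observed


def verify_outcomes(outcomes: List[Outcome]) -> Tuple[bool, List[str]]:
    """Run every check and return (all_passed, names_of_failed_outcomes).

    A check that raises counts as a failure, so a crash can never be
    reported as success.
    """
    failed = []
    for outcome in outcomes:
        try:
            ok = bool(outcome.check())
        except Exception:
            ok = False
        if not ok:
            failed.append(outcome.name)
    return (not failed, failed)


# Hypothetical generated function under review (an assumption for the demo).
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


outcomes = [
    Outcome("lowercases input", lambda: slugify("Hello") == "hello"),
    Outcome("joins words with hyphens", lambda: slugify("a b  c") == "a-b-c"),
    Outcome("strips punctuation", lambda: slugify("Hi, there!") == "hi-there"),
]

passed, failed = verify_outcomes(outcomes)
```

Here `passed` comes out false because the punctuation outcome is not met, so the work would be sent back instead of being accepted on a claimed success.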

[–] GenosseFlosse@feddit.org 1 points 18 hours ago* (last edited 18 hours ago) (2 children)

In web development it's impossible to remember all functions, parameters, syntax and quirks for PHP, HTML, JavaScript, jQuery, vue.js, CSS and whatever else code exists in this legacy project. AI really helps when you can divide your tasks into smaller steps and functions and describe exactly what you need, and have a rough idea how the resulting code should work. If something looks funky I can ask to explain or use some other way to do the same thing.

[–] iglou@programming.dev 1 points 1 hour ago

And now instead of understanding the functions, parameters, syntax and quirks yourself, to be able to produce quality code, which is the job of a software engineer, you ask an LLM to spit out code that seems to be working, do that again, and again, and again, and call it a day.

And then I'll be hired to fix it.

[–] lapping6596@lemmy.world 1 points 17 hours ago

That sounds almost like an AI version of TDD.

[–] Goldholz@lemmy.blahaj.zone 32 points 1 day ago (4 children)

Yeah no shit

[–] Bad@jlai.lu 20 points 1 day ago* (last edited 1 day ago) (1 children)

Although I don't doubt the results… can we have a source for all the numbers presented in this article?

It feels AI generated itself, there's just a mishmash of data with no link to where that data comes from.

There has to be a source, since the author mentions:

So although the study does highlight some of AI's flaws [...] new data from CodeRabbit has claimed

CodeRabbit is an AI code reviewing business. I have zero trust in anything they say on this topic.

Then we get to see who the author is:

Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars

Has anyone actually bothered clicking the link and reading past the headline?

Can you please not share / upvote / get ragebaited by dogshit content like this?

[–] Credibly_Human@lemmy.world 2 points 13 hours ago

People, especially on Lemmy, are looking for any cope that AI will just fall apart by itself and no longer bother them by existing, so they'll upvote whatever lets them think that.

The reality is that we are just heading towards the trough of disillusionment, where the investor hype peters out and we eventually just have a legitimately useful technology with all the same business hurdles as any other technology (tech bros trying to control other people's lives to enrich themselves or harm people they don't like).

[–] Minizarbi@jlai.lu 8 points 1 day ago (2 children)

Not my code though. It contains a shit ton of bugs. When I am able to write some of course.

[–] jj4211@lemmy.world 12 points 1 day ago (1 children)

Nah, AI code gen bugs are weird. As a person used to doing human review even from wildly incompetent people, AI messes up things that my mind never even thought needed to be double checked.

[–] iglou@programming.dev 2 points 19 hours ago

The things I have seen from devs who thought they could lie and pretend they didn't use AI...

[–] IAmNorRealTakeYourMeds@lemmy.world 5 points 1 day ago (1 children)

Human bugs >>> AI bug slop

[–] Minizarbi@jlai.lu 2 points 20 hours ago (1 children)

Human bugs are more beautiful

[–] IAmNorRealTakeYourMeds@lemmy.world 1 points 20 hours ago

All (human) Bugs Are Beautiful

[–] Katzelle3@lemmy.world 171 points 2 days ago (9 children)

Almost as if it was made to simulate human output but without the ability to scrutinize itself.

[–] PetteriPano@lemmy.world 122 points 2 days ago (11 children)

It's like having a lightning-fast junior developer at your disposal. If you're vague, he'll go on shitty side-quests. If you overspecify he'll get overwhelmed. You need to break down tasks into manageable chunks. You'll need to ask follow-up questions about every corner case.

A real junior developer will have improved a lot in a year. Your AI agent won't have improved.
