Programming

587

AI still doesn't work very well, businesses are faking it, and a reckoning is coming (www.theregister.com)

submitted 1 day ago by brianpeiris@lemmy.ca to c/programming@programming.dev

141 comments

Excerpt:

"Even within the coding, it's not working well," said Smiley. "I'll give you an example. Code can look right and pass the unit tests and still be wrong. The way you measure that is typically in benchmark tests. So a lot of these companies haven't engaged in a proper feedback loop to see what the impact of AI coding is on the outcomes they care about. Lines of code, number of [pull requests], these are liabilities. These are not measures of engineering excellence."

Measures of engineering excellence, said Smiley, include metrics like deployment frequency, lead time to production, change failure rate, mean time to restore, and incident severity. And we need a new set of metrics, he insists, to measure how AI affects engineering performance.

"We don't know what those are yet," he said.

One metric that might be helpful, he said, is measuring tokens burned to get to an approved pull request – a formally accepted change in software. That's the kind of thing that needs to be assessed to determine whether AI helps an organization's engineering practice.
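As a rough illustration of the kind of measurement described above, here is a minimal Python sketch. It is not from the article: the `PullRequest` record, the function names, and the sample numbers are all hypothetical, and it assumes token usage is logged per pull request. It computes tokens burned per approved pull request (counting tokens spent on rejected PRs too) alongside change failure rate, one of the established metrics Smiley lists.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical record of one AI-assisted pull request."""
    tokens_used: int   # total model tokens spent producing this PR
    approved: bool     # True if the PR was formally accepted


def tokens_per_approved_pr(prs: list[PullRequest]) -> float:
    """Total tokens burned (including on rejected PRs) per approved PR."""
    approved = sum(1 for pr in prs if pr.approved)
    if approved == 0:
        raise ValueError("no approved pull requests to measure against")
    total_tokens = sum(pr.tokens_used for pr in prs)
    return total_tokens / approved


def change_failure_rate(deployments: int, failed_deployments: int) -> float:
    """Fraction of deployments that caused a failure in production."""
    if deployments <= 0:
        raise ValueError("deployments must be positive")
    return failed_deployments / deployments


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = [
        PullRequest(tokens_used=120_000, approved=True),
        PullRequest(tokens_used=300_000, approved=False),
        PullRequest(tokens_used=80_000, approved=True),
    ]
    print(tokens_per_approved_pr(sample))   # 250000.0
    print(change_failure_rate(40, 6))       # 0.15
```

Charging rejected PRs against the approved ones is a design choice in this sketch: it makes wasted attempts show up in the number instead of disappearing from it.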

To underscore the consequences of not having that kind of data, Smiley pointed to a recent attempt to rewrite SQLite in Rust using AI.

"It passed all the unit tests, the shape of the code looks right," he said. "It's 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It's a dumpster fire. Throw it away. All that money you spent on it is worthless."

All the optimism about using AI for coding, Smiley argues, comes from measuring the wrong things.

"Coding works if you measure lines of code and pull requests," he said. "Coding does not work if you measure quality and team performance. There's no evidence to suggest that that's moving in a positive direction."

[–] Semi_Hemi_Demigod@lemmy.world 9 points 22 hours ago* (last edited 22 hours ago) (1 children)

My job has me working on AI stuff and it reminds me a lot of Internet technology back in the 90s.

For instance: I’m creating a local model to integrate with our MCP server. It took a lot of fiddling with a Modelfile for it to use the tools the MCP has installed. And it needs 20GB of VRAM to give reasonably accurate responses.

The amount of fiddling and checking and rough edges feels like writing JavaScript 1.0, or the switchover to HTML4.

Companies get a lot of praise for having AI products, but the reality isn’t nearly as flashy as they make it out to be. I’m seeing some usefulness in it as I learn more, but it’s not nearly what the hype machine says.

[–] nymnympseudonym@piefed.social -2 points 20 hours ago (2 children)

I also remember the Internet being fiddly as fuck and questionably useful during the dialup days.

AI is improving a lot faster than the Internet did. It was like a decade before we got broadband and another before we had wifi.

By that logic, people shitting on AI will look very quaint in a decade or so.

[–] OpenStars@piefed.social 1 points 13 hours ago

"Why do I have to take 5 extra steps to just quickly save a file onto my computer, without needing literally everything on the cloud, especially if I am on a laptop currently in airplane mode, most likely in a literal airplane in an area without reliable Internet connectivity?"

Also consider that there are places - third world nations, and so very MANY areas within supposedly "first-world" ones - that do not have reliable Internet, even today. The KISS principle still applies now, as it did back then too. Your argument screams privileged access, without acknowledging those basic precepts, including perpetual access to subscription services, which must always be maintained, e.g. even after someone retires.

I also disagree that arguments of the form "LLMs currently do not perform better than my own human effort, in my inexperienced hands at least" will be outdated a decade from now. If LLMs get better, those arguments will become the musings of people who struggled with early tech before it was fully ready, which does not invalidate their veracity, especially in the historical sense.

[–] Semi_Hemi_Demigod@lemmy.world 2 points 20 hours ago

The Internet is and always will be fiddly. We just keep making it so easy that it looks like magic.