[–] voodooattack@lemmy.world 5 points 23 hours ago (2 children)

This person is right. But I think the methods we use to train them are what’s fundamentally wrong. Brute-force learning? Randomised datasets past the coherence/comprehension threshold? And the rationale is that this is done for the sake of optimisation and in the name of efficiency? I can see that overfitting is a problem, but did anyone look hard enough at it? Or did someone just jump a fence at the time, everyone else decided to follow along because it “worked”, and it somehow became the gold standard that nobody can question at this point?
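
To make the overfitting point concrete, here’s a toy sketch (plain numpy, nothing LLM-specific; the curve and sizes are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten noisy samples of a smooth underlying curve.
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, x.size)

# A degree-9 polynomial has enough parameters to hit every training point.
coeffs = np.polyfit(x, y, deg=9)
train_rmse = np.sqrt(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Fresh points from the same curve expose the memorisation.
x_test = rng.uniform(0, 1, 1000)
y_test = np.sin(2 * np.pi * x_test)
test_rmse = np.sqrt(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))

print(f"train RMSE: {train_rmse:.4f}")  # essentially zero
print(f"test RMSE:  {test_rmse:.4f}")   # typically far larger
```

Near-zero error on the training points, much worse on anything it hasn’t seen: that’s the failure mode, scaled up a few billion parameters.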

[–] VoterFrog@lemmy.world 5 points 21 hours ago (1 children)

The generalized learning is usually just the first step. Coding LLMs typically go through more rounds of specialized training afterwards to tune and focus them on solving those kinds of problems. Then there’s RAG, MCP, and simulated reasoning, which technically aren’t training methods but do further improve the relevance of the outputs. There’s still a lot of ongoing work in this space; we haven’t even seen a standard settle yet.
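
For anyone unfamiliar, RAG boils down to fetching relevant context and stuffing it into the prompt. A stripped-down sketch, where generate() is a made-up stand-in for whatever LLM API you use, and retrieval is a toy word-overlap score rather than real embeddings:

```python
def tokenize(text: str) -> set[str]:
    return set(text.lower().split())

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    # Rank documents by how many words they share with the query; keep the top k.
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def generate(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call.
    return f"[model answer based on a {len(prompt)}-char prompt]"

def answer(query: str, documents: list[str]) -> str:
    # The core RAG move: retrieve context, prepend it, let the model answer.
    context = "\n".join(retrieve(query, documents))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)

docs = [
    "CPython manages memory with reference counting plus a cyclic GC.",
    "Rust enforces memory safety at compile time via ownership.",
    "MCP is a protocol for wiring tools and data sources into models.",
]
print(answer("how does cpython handle memory?", docs))
```

Real systems swap the overlap score for embedding similarity, but the shape of the pipeline is the same.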

[–] voodooattack@lemmy.world 4 points 13 hours ago

Yeah, but what I meant was: we took a wrong turn along the way, and now that it’s set in stone, the sunk cost fallacy has taken over. We (as senior developers) are applying knowledge and approaches obtained through a trap we would absolutely caution a junior against until the lesson sticks, because it IS a big deal.

Reminds me of this gem:

https://www.monkeyuser.com/2018/final-patch/

[–] bitcrafter@programming.dev 4 points 22 hours ago (1 children)

The researchers in the academic field of machine learning who came up with LLMs are certainly aware of their limitations and are exploring other possibilities. Unfortunately, what happened in industry is that people noticed one particular approach was good enough to look impressive, and everyone jumped on that bandwagon.

[–] voodooattack@lemmy.world 1 points 13 hours ago* (last edited 13 hours ago)

That’s not the problem though. Because from my perspective, it looks like this:

Someone took a shortcut because of an external time crunch, and left a comment saying this was a bad idea and should be reimplemented properly later.

But the code worked and was deployed to a production environment despite the warning, and at that specific point it transformed from being “abstract procedural logic” into “business logic”.
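
Something like this (an invented example, to be clear):

```python
# TODO(2019-11): quick hack to hit the release deadline. Replace with a
# proper rate limiter ASAP. Do NOT rely on this exact behaviour.
def throttle(requests: list) -> list:
    return requests[:100]  # silently drops everything past the first 100
```

A few years later a downstream report reconciles against “exactly 100 per batch”, and the throwaway hack IS the contract.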