this post was submitted on 10 Sep 2025
912 points (99.2% liked)

Fuck AI

"We did it, Patrick! We made a technological breakthrough!"

A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.

[–] ech@lemmy.ca 40 points 3 weeks ago* (last edited 3 weeks ago) (3 children)

Took a look because, as frustrating as it'd be, it'd still be a step in the right direction. But no, they're still adamant that it's just a "quirk".

Conclusions

We hope that the statistical lens in our paper clarifies the nature of hallucinations and pushes back on common misconceptions:

Claim: Hallucinations will be eliminated by improving accuracy because a 100% accurate model never hallucinates.
Finding: Accuracy will never reach 100% because, regardless of model size, search and reasoning capabilities, some real-world questions are inherently unanswerable.

Claim: Hallucinations are inevitable.
Finding: They are not, because language models can abstain when uncertain.

Claim: Avoiding hallucinations requires a degree of intelligence which is exclusively achievable with larger models.
Finding: It can be easier for a small model to know its limits. For example, when asked to answer a Māori question, a small model which knows no Māori can simply say “I don’t know” whereas a model that knows some Māori has to determine its confidence. As discussed in the paper, being “calibrated” requires much less computation than being accurate.

Claim: Hallucinations are a mysterious glitch in modern language models.
Finding: We understand the statistical mechanisms through which hallucinations arise and are rewarded in evaluations.

Claim: To measure hallucinations, we just need a good hallucination eval.
Finding: Hallucination evals have been published. However, a good hallucination eval has little effect against hundreds of traditional accuracy-based evals that penalize humility and reward guessing. Instead, all of the primary eval metrics need to be reworked to reward expressions of uncertainty.

Infuriating.
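
To put numbers on that last quoted point about evals, here's a toy sketch (not from the paper; the benchmark size, guess accuracy, and scoring rules are all made up for illustration): under a plain accuracy metric, a model that guesses when unsure always outscores one that abstains, while a metric that gives zero for "I don't know" and a penalty for a wrong answer flips the incentive.

```python
# Toy numbers, invented for illustration -- not from the paper.
QUESTIONS = 100        # hypothetical benchmark size
KNOWN = 60             # questions the model actually knows the answer to
GUESS_ACCURACY = 0.25  # chance a blind guess on an unknown question is right

def accuracy_only(abstains: bool) -> float:
    """Traditional eval: 1 point per correct answer, 0 for anything else."""
    unknown = QUESTIONS - KNOWN
    lucky_guesses = 0 if abstains else unknown * GUESS_ACCURACY
    return (KNOWN + lucky_guesses) / QUESTIONS

def abstention_aware(abstains: bool) -> float:
    """Reworked eval: +1 correct, 0 for 'I don't know', -1 for a wrong answer."""
    unknown = QUESTIONS - KNOWN
    if abstains:
        right, wrong = KNOWN, 0
    else:
        right = KNOWN + unknown * GUESS_ACCURACY
        wrong = unknown * (1 - GUESS_ACCURACY)
    return (right - wrong) / QUESTIONS

print("accuracy-only:    guesser", accuracy_only(False), " abstainer", accuracy_only(True))
print("abstention-aware: guesser", abstention_aware(False), " abstainer", abstention_aware(True))
# accuracy-only:    guesser 0.7  abstainer 0.6   -> guessing wins
# abstention-aware: guesser 0.4  abstainer 0.6   -> saying "I don't know" wins
```

The exact numbers don't matter: as long as abstaining scores zero under the accuracy-only metric, guessing dominates, which is the quoted point about evals that penalize humility and reward guessing.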

[–] wewbull@feddit.uk 7 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

Finding: Accuracy will never reach 100% because, regardless of model size, search and reasoning capabilities, some real-world questions are inherently unanswerable.

Translation: PEBKAC (problem exists between keyboard and chair). You asked the wrong question.

[–] jherazob@fedia.io 6 points 3 weeks ago (1 children)

Basically "You must be prompting it wrong!"

[–] fading_person@lemmy.zip 1 points 3 weeks ago

We don't have mathematical proof that this technology will produce adequate results. You must have faith in the technology.

[–] Voroxpete@sh.itjust.works 2 points 3 weeks ago

Got a link or a title I can google to find the full paper? I'd be really interested in reading it.

[–] lets_get_off_lemmy@reddthat.com 0 points 3 weeks ago* (last edited 3 weeks ago)

This further points to the solution being smaller models that know less and are trained for narrower tasks, instead of gargantuan models that require an insane amount of resources to answer easy questions. Route each query to a smaller, more specialized model based on what's being asked. This was the motivation behind MoE models, but I think there are other architectures and paradigms to explore.
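
A rough sketch of the query-routing idea (nothing here is from the comment or any real system; the keyword router, specialist names, and "I don't know" fallback are invented for illustration, and note that actual MoE models use a learned gating network over experts inside one model rather than dispatching between separate models):

```python
from typing import Callable

# Hypothetical specialist "models" -- stand-ins for small, task-specific systems.
def math_model(query: str) -> str:
    return f"[math specialist] answering: {query}"

def code_model(query: str) -> str:
    return f"[code specialist] answering: {query}"

ROUTES: dict[str, Callable[[str], str]] = {
    "integral": math_model,
    "equation": math_model,
    "function": code_model,
    "bug": code_model,
}

def route(query: str) -> str:
    """Naive query-level router: pick a specialist by keyword, abstain otherwise.
    A learned gating network (as in MoE) would replace this keyword lookup."""
    lowered = query.lower()
    for keyword, model in ROUTES.items():
        if keyword in lowered:
            return model(query)
    # No specialist covers this: a model that knows its limits just says so.
    return "I don't know."

print(route("Solve the integral of x**2"))
print(route("Why does this function throw a TypeError?"))
print(route("Translate this sentence into Māori"))  # -> "I don't know."
```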