• Lvxferre [he/him]@mander.xyz
    9 months ago

    The whole thing can be summed up as follows: they're selling you a hammer and telling you to use it with screws. Once you hammer the screw in, it trashes the wood really badly. Then they call that wood-trashing "hallucination" and promise you better hammers that won't do it. Except a hammer is not a tool to use with screws, dammit; you should be using a screwdriver.

    An AI leaderboard suggests the newest reasoning models used in chatbots are producing less accurate results because of higher hallucination rates.

    So he’s suggesting that the models are producing less accurate results… because they have higher rates of less accurate results? This is a tautological pseudo-explanation.

    AI chatbots from tech companies such as OpenAI and Google have been getting so-called reasoning upgrades over the past months

    When are people going to accept the fact that large “language” models are not general intelligence?

    ideally to make them better at giving us answers we can trust

    Those models are useful, but only a fool trusts (read: is gullible towards) their output.

    OpenAI says the reasoning process isn’t to blame.

    Just like my dog isn’t to blame for the holes in my garden. Because I don’t have a dog.

    This is sounding more and more like model collapse - models perform worse when trained on the output of other models.
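
    A toy illustration of that failure mode (my own sketch, not anyone's actual training pipeline): fit a simple model to some data, sample the next "training set" from the fit, refit, and repeat. With a Gaussian, the estimated spread drifts and eventually collapses, and the tails of the original distribution are the first thing to go.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_samples = 20                              # tiny "training set" each generation
    data = rng.normal(0.0, 1.0, n_samples)      # generation 0: real data from N(0, 1)

    for gen in range(1, 1001):
        mu, sigma = data.mean(), data.std(ddof=1)   # fit a Gaussian to the current data
        data = rng.normal(mu, sigma, n_samples)     # next generation trains only on model output
        if gen % 200 == 0:
            print(f"gen {gen:4d}: mu = {mu:+.3f}, sigma = {sigma:.3f}")
    ```

    Run it and sigma wanders toward zero. The analogy to LLMs retraining on web text that is increasingly full of LLM output is loose, but the direction of the effect is the same.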

    inb4 sealions asking what’s my definition of reasoning in 3…2…1…

    • reksas@sopuli.xyz
      9 months ago

      AI is just too nifty a word, even if it's a gross misuse of the term. "Large language model" doesn't roll off the tongue as easily.

      • vintageballs@feddit.org
        9 months ago

        The goalposts have shifted a lot in the past few years, but under both the broader and even the narrower definition, current language models are precisely what was meant by AI, and they clearly fall into that category of computer program. They aren't broad / general AI, but they are definitely narrow / weak AI systems.

        I get that it's trendy to shit on LLMs, often for good reason, but that shouldn't mean we redefine terms just because some system doesn't fit our idealized, under-informed definition of a technical term.