• Lvxferre [he/him]@mander.xyz
    9 months ago

    The whole thing can be summed up as follows: they're selling you a hammer and telling you to use it with screws. Once you hammer the screw in, it trashes the wood really badly. Then they call that wood-trashing "hallucination" and promise you better hammers that won't do it. Except a hammer is not a tool to use with screws, dammit; you should be using a screwdriver.

    An AI leaderboard suggests the newest reasoning models used in chatbots are producing less accurate results because of higher hallucination rates.

    So he’s suggesting that the models are producing less accurate results… because they have higher rates of less accurate results? This is a tautological pseudo-explanation.

    AI chatbots from tech companies such as OpenAI and Google have been getting so-called reasoning upgrades over the past months

    When are people going to accept the fact that large “language” models are not general intelligence?

    ideally to make them better at giving us answers we can trust

    Those models are useful, but only a fool trusts (read: is gullible towards) their output.

    OpenAI says the reasoning process isn’t to blame.

    Just like my dog isn’t to blame for the holes in my garden. Because I don’t have a dog.

    This is sounding more and more like model collapse - models perform worse when trained on the output of other models.
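
    A toy illustration of that failure mode (my own sketch, not anyone's actual training pipeline): fit a simple model to some data, sample the next "training set" from the fit, refit, and repeat. With a Gaussian, the estimated spread drifts and eventually collapses, and the tails of the original distribution are the first thing to go.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_samples = 20                              # tiny "training set" each generation
    data = rng.normal(0.0, 1.0, n_samples)      # generation 0: real data from N(0, 1)

    for gen in range(1, 1001):
        mu, sigma = data.mean(), data.std(ddof=1)   # fit a Gaussian to the current data
        data = rng.normal(mu, sigma, n_samples)     # next generation trains only on model output
        if gen % 200 == 0:
            print(f"gen {gen:4d}: mu = {mu:+.3f}, sigma = {sigma:.3f}")
    ```

    Run it and sigma wanders toward zero. The analogy to LLMs retraining on web text that is increasingly full of LLM output is loose, but the direction of the effect is the same.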

    inb4 sealions asking what’s my definition of reasoning in 3…2…1…

    • reksas@sopuli.xyz
      9 months ago

      AI is just too nifty a word, even if it's a gross misuse of the term. "Large language model" doesn't roll off the tongue as easily.

      • vintageballs@feddit.org
        9 months ago

        The goalposts have shifted a lot in the past few years, but under both the broader and even the narrower definition, current language models are precisely what was meant by AI, and they clearly fall into that category of computer program. They aren't broad / general AI, but they are definitely narrow / weak AI systems.

        I get that it's trendy to shit on LLMs, often for good reason, but that shouldn't mean we redefine terms just because some system doesn't fit our idealized, under-informed definition of a technical term.