Hey all,

I’m at my wits’ end trying to find an article I saw a while back, either on Lemmy or Reddit, and I figured this community might know it.

It was about a research team somehow cracking open an LLM and looking at the way it does calculations, and I remember there was a sort of flowchart in the article, with the LLM grouping interim results into weird-ass categories, like “between 26-ish and 34-ish”, and then using a separate process for figuring out the last digit.

I think the article might have been linked in response to a question like why LLMs mess up the last digit of number calculations.

Does any of this ring a bell for anyone? I’ve tried searching for it every way I can phrase the idea, but all I get is a flood of ads and guides about “how to do math in LLMs”.

  • hendrik@palaver.p3x.de · 3 hours ago

    That’s not entirely correct. They kinda “do maths”. I tried to google an answer to OP’s question, and there are a bunch of papers showing how LLMs develop internal circuits to handle numbers. (I didn’t find that specific article, though.) Of course everything is prediction with LLMs, but it seems they do try to form some model of how to do base-10 maths. They’re certainly bad at it and nowhere near a real calculator. And you’re right, what people usually do is give them tool access: either a proper calculator, or more often a Python sandbox, with a prompt telling the model to write a Python snippet for any arithmetic. But the usual models can also add and multiply smaller numbers without anything in the background. That’s not much of an achievement, though, since they can simply memorize the basic multiplication tables.
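    Rough sketch of that tool pattern, so it’s easier to picture (the model call here is just a stub standing in for whatever API you’d actually use; the only “real” part is the restricted evaluator and the loop around it):

        # The model is prompted to answer arithmetic questions by emitting a plain
        # expression; a tiny sandboxed evaluator computes it and the exact result
        # is fed back into the conversation.
        import ast
        import operator

        # Only plain arithmetic is allowed, so the "sandbox" can't run arbitrary code.
        OPS = {
            ast.Add: operator.add,
            ast.Sub: operator.sub,
            ast.Mult: operator.mul,
            ast.Div: operator.truediv,
            ast.Pow: operator.pow,
            ast.USub: operator.neg,
        }

        def safe_eval(expr: str) -> float:
            """Evaluate an arithmetic expression without handing over full Python."""
            def walk(node):
                if isinstance(node, ast.Expression):
                    return walk(node.body)
                if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                    return node.value
                if isinstance(node, ast.BinOp) and type(node.op) in OPS:
                    return OPS[type(node.op)](walk(node.left), walk(node.right))
                if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
                    return OPS[type(node.op)](walk(node.operand))
                raise ValueError("not plain arithmetic")
            return walk(ast.parse(expr, mode="eval"))

        def fake_llm(prompt: str) -> str:
            # Stand-in for the actual model call; a real setup would prompt the
            # model to reply with an expression for the tool to evaluate.
            return "4382 * 9177"

        question = "What is 4382 times 9177?"
        expr = fake_llm(question)
        print(expr, "=", safe_eval(expr))  # this exact value goes back to the model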

    • James R Kirk@startrek.website · 3 hours ago

      Correct me if I’m wrong, but what you’re describing still sounds like probabilistic output, right? Meaning it’s not the same output every time (and therefore it can’t actually be doing math).

      • hendrik@palaver.p3x.de · 36 minutes ago

        The randomization comes in later. The model weights themselves don’t change; they’re just numbers that get multiplied. So the model will always be exactly 94% certain that 5 times 12 equals 60. (Percentage entirely made up by me, and I’m oversimplifying.)

        I think what you mean is the sampler, for example the temperature setting. That’s added on top: it shakes things up and occasionally makes the LLM output something other than the highest-confidence token. And you’re right, cranking up the temperature makes the answers more random. But if you use, say, ChatGPT on default settings, it should almost always give the correct answer to very basic arithmetic with low-ish numbers; I’ve never seen it do anything else. And you can always set the temperature to zero, in which case the sampler gives you deterministic output: always the same answer for the same input.
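        Toy example of what I mean, with entirely made-up numbers (these “logits” aren’t from any real model, it’s just to show where the randomness enters):

            # The forward pass always gives the same scores for the same input;
            # only the sampler on top adds randomness.
            import math
            import random

            # Invented scores for the next token after "5 * 12 =". These never change.
            logits = {"60": 6.0, "50": 1.5, "612": 0.5}

            def probabilities(scores, temperature):
                if temperature == 0:  # temperature zero: just take the top token
                    best = max(scores, key=scores.get)
                    return {tok: (1.0 if tok == best else 0.0) for tok in scores}
                exps = {tok: math.exp(s / temperature) for tok, s in scores.items()}
                total = sum(exps.values())
                return {tok: e / total for tok, e in exps.items()}

            def sample(scores, temperature):
                probs = probabilities(scores, temperature)
                return random.choices(list(probs), weights=list(probs.values()))[0]

            print(probabilities(logits, 1.0))               # identical on every run
            print([sample(logits, 0.0) for _ in range(5)])  # always "60"
            print([sample(logits, 2.0) for _ in range(5)])  # occasionally not "60"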

        That said, I also tried decimal numbers, large values, proper equations, trigonometry and division, and there ChatGPT definitely can’t give reliable answers. It’s kind of surprising (to me) that it sometimes seems to pull it off, or at least has some vague idea of where to go. But it seems to me that elementary-school level is the limit.

        What the papers say is that there’s more going on inside: the models don’t just memorize answers or resort to random guessing. I’ve only skimmed those papers, so I don’t know the exact details, but it seems the models form some “understanding” of how addition works. We know they’re not specifically built to be calculators, and from my experience they’re not good at it, but they’re not just rolling dice either.

        (And transformer-based large language models (plus added memory) are Turing complete, so… theoretically they could be an accurate calculator 😂 just an absurdly idiotic and wasteful one…)

        Ultimately, all of this is hard to compare to how a human does maths. I also memorized my multiplication tables, but beyond that I work through several steps in my head, pretty much the way I learned it in school. An LLM, not so much; we’d have to properly read the papers to find out how they do it, but they’ve probably inferred different ways of arriving at answers… Unless we’re talking about the “reasoning” modes, but I don’t think those do proper reasoning as of today.