Quick math is the one thing computers have always been good at. Not anymore, apparently.

  • kryptonianCodeMonkey@lemmy.world
    11 hours ago

    Predictive language models are bad at this because they are not actually parsing meaning from the text. They just output patterns they have seen before in their training data, conditioned on your input. The patterns are complex, and the training data is usually immense enough that the model has seen plenty of just about every kind of pattern. That is often good enough to produce sensible output, but not always.
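
    To make that concrete, here is a deliberately tiny, made-up sketch (plain Python, not any real LLM library) of what pure pattern matching looks like: the "model" only knows which token most often followed another token in its training text, so it has nothing sensible to offer when a prompt falls outside those patterns.

    ```python
    # Toy illustration (not a real language model): predict the next token by
    # looking up which token most often followed the current one in training
    # text. It matches surface patterns; it never evaluates the arithmetic.
    from collections import Counter, defaultdict

    training_text = "2 + 2 = 4 . 2 + 2 = 4 . 3 + 3 = 6 ."
    follows = defaultdict(Counter)
    tokens = training_text.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        follows[prev][nxt] += 1

    def predict_next(token: str) -> str:
        """Return the most common continuation seen in training, or '?' if unseen."""
        if token not in follows:
            return "?"
        return follows[token].most_common(1)[0][0]

    print(predict_next("="))   # -> '4' (seen most often after '=' in training)
    print(predict_next("7"))   # -> '?' ('7' never appeared, so no pattern to copy)
    ```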

    There are models that handle this better through a few different strategies. One is to have a team of specialized models take on the problem: one model categorizes the prompt and the generated data, then hands it off to other models specifically trained on that kind of data, or even to a basic, dumb calculator in cases like this, which can parse it and produce results it understands. The outputs of all those specialists are then passed through one more model that organizes them into a cohesive answer.
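
    A minimal sketch of that routing idea, with invented names (classify, calculator, and language_model are placeholders, not any real system's API): a crude classifier sends arithmetic to an exact calculator instead of letting the language model guess, and a final step wraps the specialist's result.

    ```python
    # Rough sketch of the "team of specialists" strategy under those assumptions.
    import ast, operator, re

    OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}

    def calculator(expr: str) -> str:
        """The 'basic stupid calculator': safely evaluate +, -, *, / expressions."""
        def ev(node):
            if isinstance(node, ast.Expression):
                return ev(node.body)
            if isinstance(node, ast.Constant):
                return node.value
            if isinstance(node, ast.BinOp):
                return OPS[type(node.op)](ev(node.left), ev(node.right))
            raise ValueError("unsupported expression")
        return str(ev(ast.parse(expr, mode="eval")))

    def classify(prompt: str) -> str:
        """Stand-in for the categorizing model: crude check for pure arithmetic."""
        return "math" if re.fullmatch(r"[\d\s+\-*/().]+", prompt) else "general"

    def language_model(prompt: str) -> str:
        """Placeholder for the general-purpose model (assumed, not implemented here)."""
        return f"[LLM answer to: {prompt!r}]"

    def answer(prompt: str) -> str:
        """Route to the right specialist, then hand the result back for final wording."""
        route = classify(prompt)
        result = calculator(prompt) if route == "math" else language_model(prompt)
        return f"({route} specialist) {result}"

    print(answer("17 * 23 + 4"))           # exact arithmetic: 395, never guessed
    print(answer("Why is the sky blue?"))  # falls through to the language model
    ```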

    Alternatively, you can have a series of models that successively break the prompt and the generated data down into finer details and steps. Instead of guessing at math problems like this one, the system literally “shows its work”, so to speak, applying arithmetic step by step rather than relying on “good enough” language modeling.
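
    A rough illustration of that “shows its work” strategy, with plain Python standing in for the models that would emit each intermediate step: a multiplication is reduced to small, checkable partial products instead of one big guess.

    ```python
    # Sketch only: in a real system each step would come from a model trained to
    # emit intermediate reasoning; here ordinary long multiplication plays that role.
    def multiply_showing_work(a: int, b: int) -> int:
        """Long multiplication, printing each partial product like a worked solution."""
        total = 0
        for place, digit in enumerate(reversed(str(b))):
            partial = a * int(digit) * (10 ** place)
            total += partial
            print(f"step: {a} x {digit} x 10^{place} = {partial} (running total {total})")
        print(f"answer: {a} x {b} = {total}")
        return total

    multiply_showing_work(417, 23)
    # step: 417 x 3 x 10^0 = 1251 (running total 1251)
    # step: 417 x 2 x 10^1 = 8340 (running total 9591)
    # answer: 417 x 23 = 9591
    ```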