That being said… LLMs are still language models: they make statistical predictions about the next word (or token), but in practice, despite names like “hallucinations” or “reasoning” or labels like “thinking”, they are not doing any reasoning and have no logic for, e.g., fact checking. I would expect journalists to pay a LOT of attention to distinguishing between facts and fiction, between speculation and historical record, between propaganda, popular ideas, and actual events.
So… even if we were to magically solve outdated datasets, that still doesn’t solve the linchpin, namely that models are NOT thinking.
AFAIK that’s what RAG https://en.wikipedia.org/wiki/Retrieval-augmented_generation is all about: the training dataset is what it is, but you extend it at query time with your own data, which can be brand new and even stay private.
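A minimal sketch of that idea, assuming a toy word-overlap ranking in place of the vector embeddings real RAG pipelines use, with made-up documents purely for illustration:

```python
def build_rag_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Pick the k documents most relevant to the query (toy word-overlap scoring)
    and prepend them to the prompt, so the model answers from data it was never trained on."""
    q_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    context = "\n".join(ranked[:k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical private docs: they stay on your side until you decide what to send to the model.
docs = ["Our Q3 launch moved to November.", "The office is closed on Fridays."]
print(build_rag_prompt("When is the launch?", docs))
```

The model itself is unchanged; the retrieval step just decides which of your fresh or private snippets get pasted into the prompt.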
I am choosing to pronounce your username utter pyre.
🔥.