• Twipped@l.twipped.social · 1 day ago

    Given that LLMs are always at least several months behind reality in their training, and the data they’re training on is content produced by real journalists, I really don’t see how it could EVER act as a journalist. I’m not sure it could even interview someone reliably.

    • AeonFelis@lemmy.world · 1 day ago
      • “Why did you take that bribe?”
      • “I did not take any bribes!”
      • “You are absolutely correct! You did not take any bribes”
    • utopiah@lemmy.world · 1 day ago

      AFAIK that’s what RAG (https://en.wikipedia.org/wiki/Retrieval-augmented_generation) is all about: the training dataset is what it is, but you extend it at query time with your own data, which can be brand new and even stay private. (There’s a rough sketch of the flow at the end of this comment.)

      That being said… LLMs are still language models: they produce statistical predictions of the next word (or token). In practice, despite names like “hallucinations” or “reasoning” and labels like “thinking”, they are not doing any reasoning and have no logic for things like fact-checking. I would expect journalists to pay a LOT of attention to distinguishing facts from fiction, speculation from the historical record, and propaganda or popular ideas from actual events.

      So… even if we were to magically solve the outdated-dataset problem, that still doesn’t address the linchpin: the models are NOT thinking.
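
      For anyone curious what “extend it with your own data” actually looks like, here’s a rough sketch of the RAG flow (everything here is made up for illustration; a toy word-overlap score stands in for real embeddings and a vector store, and the actual LLM call is left out):

      ```python
      # Minimal RAG sketch (illustrative only, not any specific library's API):
      # 1) keep your own up-to-date / private documents outside the model,
      # 2) retrieve the ones most relevant to the question,
      # 3) paste them into the prompt so a frozen LLM can answer from them.
      from collections import Counter

      documents = [
          "2025-06-01: The city council approved the new transit budget.",
          "2025-06-03: The mayor denied taking any bribes from contractors.",
          "2024-11-12: Election results were certified by the state board.",
      ]

      def score(query: str, doc: str) -> int:
          """Count how many query words appear in the document (toy relevance)."""
          q, d = Counter(query.lower().split()), Counter(doc.lower().split())
          return sum(min(q[w], d[w]) for w in q)

      def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
          """Return the k documents with the highest overlap score."""
          return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

      def build_prompt(query: str, docs: list[str]) -> str:
          """Stuff retrieved context into the prompt; the model itself stays frozen."""
          context = "\n".join(f"- {d}" for d in retrieve(query, docs))
          return (
              "Answer using only the context below.\n"
              f"Context:\n{context}\n"
              f"Question: {query}\n"
          )

      prompt = build_prompt("Did the mayor take bribes?", documents)
      print(prompt)
      # The prompt would then be sent to an LLM (call not shown); the model still
      # just predicts tokens over this text. Retrieval adds fresh facts, not reasoning.
      ```

      The point being: retrieval fixes the “data is months old” part, not the “no actual reasoning” part.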