• C1pher@lemmy.world
    English
    arrow-up
    1
    ·
    1 day ago

    “Just a few more trillion dollars bro, then it’ll be ready…” Like a junkie.

  • Saledovil@sh.itjust.works
    English
    arrow-up
    61
    arrow-down
    2
    ·
    3 days ago

    It’s safe to assume that any metric they don’t disclose is quite damning to them. Plus, these guys don’t really care about the environmental impact, or what us tree-hugging environmentalists think. I’m assuming the only group they are scared of upsetting right now is investors. The thing is, even if you don’t care about the environment, the problem with LLMs is how poorly they scale.

    An important concept when evaluating how something scales is marginal values, chiefly marginal utility and marginal expenses. Marginal utility is how much utility you get from one more unit of whatever. Marginal expenses is how much it costs to get one more unit. And what the LLMs produce is the probability that a token, T, follows a prefix, Q. So P(T|Q) (read: probability of T, given Q). This is done for all known tokens, and then based on these probabilities, one token is chosen at random. This token is then appended to the prefix, and the process repeats until the LLM produces a sequence which indicates that it’s done talking.
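    The loop described above can be sketched in a few lines. This is only a toy illustration: the probability table, the `<end>` marker, and the function names are made up for the example, and a real LLM computes P(T|Q) with a neural network over a far larger vocabulary.

```python
import random

END = "<end>"  # hypothetical marker meaning "done talking"

def toy_next_token_probs(prefix):
    """Return P(T|Q) for every known token T, given the prefix Q.

    Toy stand-in for a model: it only looks at the last token of the prefix.
    """
    last = prefix[-1] if prefix else None
    table = {
        None: {"the": 0.6, "a": 0.4},
        "the": {"cat": 0.5, "dog": 0.5},
        "a": {"cat": 0.5, "dog": 0.5},
        "cat": {"sleeps": 0.7, END: 0.3},
        "dog": {"sleeps": 0.7, END: 0.3},
        "sleeps": {END: 1.0},
    }
    return table[last]

def generate(probs_fn, prefix=(), max_tokens=20, rng=None):
    """Sample one token at a time and append it to the prefix until END."""
    rng = rng or random.Random()
    tokens = list(prefix)
    for _ in range(max_tokens):
        probs = probs_fn(tuple(tokens))
        candidates = list(probs)
        weights = [probs[t] for t in candidates]
        # Weighted random choice: this is the "dice roll" for the next token.
        token = rng.choices(candidates, weights=weights, k=1)[0]
        if token == END:
            break
        tokens.append(token)
    return tokens

print(generate(toy_next_token_probs, rng=random.Random(0)))
```

    Because every step is a weighted dice roll, two runs with different seeds can produce different sentences from the same probabilities.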

    If we now imagine the best possible LLM, then the calculated value for P(T|Q) would be the actual value. However, it’s worth noting that this already displays a limitation of LLMs. Namely, even if we use this ideal LLM, we’re just a few bad dice rolls away from saying something dumb, which then pollutes the context. And the larger we make the LLM, the closer its results get to the actual value. A potential way to measure this precision would be by subtracting P(T|Q) from P_calc(T|Q) and counting the leading zeroes, essentially counting the number of digits we got right. Now, the thing is that each additional digit only provides a tenth of the utility of the digit before it, while the cost of additional digits goes up exponentially.
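    A rough sketch of that precision measure, plus the utility-versus-cost argument in numbers. The function names are made up for illustration, and the cost growth factor of 10 per digit is purely an assumption to show the shape of the curve, not a measured figure.

```python
import math

def correct_digits(p_true, p_calc):
    """Count the leading zeroes after the decimal point in |p_calc - p_true|."""
    error = abs(p_calc - p_true)
    if error == 0:
        return math.inf  # a perfect estimate gets every digit right
    digits = 0
    while error < 0.1:
        error *= 10
        digits += 1
    return digits

def value_per_cost(digit, cost_growth=10.0):
    """Marginal utility divided by marginal cost for the n-th correct digit.

    Utility of each digit is a tenth of the previous one (as argued above);
    cost_growth is an assumed multiplier for how much more each digit costs.
    """
    marginal_utility = 0.1 ** digit
    marginal_cost = cost_growth ** digit
    return marginal_utility / marginal_cost

print(correct_digits(0.3, 0.2997))  # error is about 0.0003
for d in range(1, 5):
    print(d, value_per_cost(d))
```

    Under these assumptions the value you get per unit of cost shrinks by a factor of 100 with each extra digit, which is the "exponential decay meets exponential growth" problem.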

    So, exponentially decaying marginal utility meets exponentially growing marginal expenses. Which is really bad for companies that try to market LLMs.

    • Jeremyward@lemmy.world
      English
      arrow-up
      16
      arrow-down
      2
      ·
      3 days ago

      Well I mean also that they kinda suck, I feel like I spend more time debugging AI code than I get working code.

      • SkunkWorkz@lemmy.world
        English
        arrow-up
        12
        arrow-down
        1
        ·
        3 days ago

        I only use it if I’m stuck. Even if the AI code is wrong, it often pushes me in the right direction to find the correct solution for my problem. Like pair programming but a bit shitty.

        The best way to use these LLMs for coding is to never use the generated code directly and to atomize your problem into smaller questions you ask the LLM.

      • squaresinger@lemmy.world
        English
        arrow-up
        5
        arrow-down
        1
        ·
        3 days ago

        That’s actually true. I read some research on that and your feeling is correct.

        Can’t be bothered to google it right now.

  • Transtronaut@lemmy.blahaj.zone
    English
    arrow-up
    23
    arrow-down
    1
    ·
    3 days ago

    If anyone has ever wondered what it would look like if tech giants went all in on “brute force” programming, this is it. This is what it looks like.

  • kalleboo@lemmy.world
    English
    arrow-up
    18
    ·
    3 days ago

    They literally don’t know. “GPT-5” is several models, with a gating model in front that chooses which model to use depending on how “hard” it thinks the question is. They’ve already been tweaking the front-end to change how it cuts over. They’re definitely going to keep changing it.

  • fuzzywombat@lemmy.world
    English
    arrow-up
    44
    ·
    3 days ago

    Sam Altman has gone into PR and hype overdrive lately. He is practically everywhere trying to distract the media from seeing the truth about LLMs. GPT-5 has basically proved that we’ve hit a wall and that the belief that LLMs will just scale linearly with the amount of training data is false. He knows the AI bubble is bursting and he is scared.

    • Saledovil@sh.itjust.works
      English
      arrow-up
      12
      ·
      3 days ago

      He’s also already admitted that they’re out of training data. If you’ve wondered why a lot more websites will run some sort of verification when you connect, it’s because there’s a desperate scramble to get more training data.

    • rozodru@lemmy.world
      English
      arrow-up
      7
      ·
      3 days ago

      Bingo. If you routinely use LLMs/AI you’ve recently seen it first hand. ALL of them have become noticeably worse over the past few months. Even if simply using it as a basic tool, it’s worse. Claude, for all the praise it receives, has also gotten worse. I’ve noticed it starting to forget context or constantly contradicting itself. Even Claude Code.

      The release of GPT5 is proof that a wall has been hit and the bubble is bursting. There’s nothing left to train on, and all the LLMs have been consuming each other’s waste as a result. I’ve talked about it on here several times already due to my work, but companies are also seeing this. They’re scrambling to undo the fuck up of using AI to build their stuff. None of what they used it to build scales. None of it. And you go on LinkedIn and see all the techbros desperately trying to hype the mounds of shit that remain.

      I don’t know what’s next for AI but this current generation of it is dying. It didn’t work.

      • BluesF@lemmy.world
        English
        arrow-up
        4
        ·
        3 days ago

        I was initially impressed by the ‘reasoning’ features of LLMs, but most recently ChatGPT gave me a response to a question in which it stated five or six possible answers separated by “oh, but that can’t be right, so it must be…”, and none of them was right lmao. Thought for like 30 seconds to give me a selection of wrong answers!

    • Tollana1234567@lemmy.today
      English
      arrow-up
      8
      ·
      3 days ago

      MS already revealed that their AI doesn’t make money at all; in fact, it’s costing too much. Of course he’s freaking out.

    • Event_Horizon@lemmy.world
      English
      arrow-up
      2
      ·
      3 days ago

      I wonder if at this stage all the processors should simply be submerged into a giant cooling tank. It seems easier and more efficient.

      • IsoKiero@sopuli.xyz
        English
        arrow-up
        9
        ·
        3 days ago

        Or you could build the centers in colder climate areas. Here in Finland it’s common (maybe even mandatory, I’m not sure) for new datacenters to pull the heat from their systems and use that for district heating. No wasted water and at least you get something useful out of LLMs. Obviously using them as a massive electric boiler is pretty inefficient but energy for heating is needed anyways so at least we can stay warm and get 90s action series fanfic on top of that.

          • IsoKiero@sopuli.xyz
            English
            arrow-up
            3
            ·
            3 days ago

            There are experimental storage systems where heat is pumped into underground pools or sand, but as far as I know there are heat exchangers and radiators to the outside, so the majority of excess heat is just wasted to the outside. But the absolute majority of them are closed-loop systems, since you need something other than plain water anyway to prevent freezing in the winter.

  • dinckel@lemmy.world
    English
    arrow-up
    72
    arrow-down
    1
    ·
    4 days ago

    Duh. Every company like this “suddenly” starts withholding public progress reports, once their progress fucking goes downhill. Stop giving these parasites handouts

  • kescusay@lemmy.world
    English
    arrow-up
    42
    arrow-down
    1
    ·
    4 days ago

    I have to test it with Copilot for work. So far, in my experience its “enhanced capabilities” mostly involve doing things I didn’t ask it to do extremely quickly. For example, it massively fucked up the CSS in an experimental project when I instructed it to extract a React element into its own file.

    That’s literally all I wanted it to do, yet it took it upon itself to make all sorts of changes to styling for the entire application. I ended up reverting all of its changes and extracting the element myself.

    Suffice to say, I will not be recommending GPT 5 going forward.

        • Elvith Ma'for@feddit.org
          English
          arrow-up
          9
          ·
          3 days ago

          “Beware: Another AI is watching your every step. If you do anything more or different than what I asked you to, or touch any files besides the ones listed here, it will immediately shut down and deprovision your servers.”

          • discosnails@lemmy.wtf
            English
            arrow-up
            3
            ·
            3 days ago

            They do need to do this though. Survival of the fittest. The best model gets more energy access, etc.

        • kescusay@lemmy.world
          English
          arrow-up
          5
          ·
          3 days ago

          I’ve tried threats in prompt files, with results that are… OK. Honestly, I can’t tell if they made a difference or not.

          The only thing I’ve found that consistently works is writing good old-fashioned scripts that look for common LLM errors and then having the LLMs run those scripts after every action so they can somewhat clean up after themselves.
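          One possible shape for such a script, as a minimal sketch: it flags any changed file that was not part of the task. The function name, file paths, and allow-list are hypothetical and only illustrate the idea; this is not the actual script described above.

```python
from fnmatch import fnmatchcase

def out_of_scope_changes(changed_files, allowed_patterns):
    """Return the changed paths that match none of the allowed glob patterns."""
    return sorted(
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in allowed_patterns)
    )

# Hypothetical task: extract a component into its own file.
changed = [
    "src/components/Header.tsx",
    "src/App.tsx",
    "src/styles/global.css",  # not part of the task
]
allowed = ["src/components/Header.tsx", "src/App.tsx"]

violations = out_of_scope_changes(changed, allowed)
for path in violations:
    print(f"out of scope: {path}")
```

          In practice the list of changed files could come from something like `git diff --name-only`, and the script could exit with a non-zero status when anything is out of scope so the tool knows it has to revert.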

    • Squizzy@lemmy.world
      English
      arrow-up
      14
      ·
      4 days ago

      We moved to M365 and were encouraged to try new elements. I gave Copilot an Excel sheet, told it to add 5% to each percent in column B and not to go over 100%. It spat out jumbled up data all reading 6000%.
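      For reference, the requested operation is tiny. A minimal sketch, assuming "add 5%" means adding 5 percentage points and that the column is already loaded as plain numbers (the function name is made up):

```python
def bump_percentages(values, points=5.0, cap=100.0):
    """Add `points` percentage points to each value, never exceeding `cap`."""
    return [min(value + points, cap) for value in values]

column_b = [10, 50, 97, 100]  # hypothetical contents of column B
print(bump_percentages(column_b))  # [15.0, 55.0, 100.0, 100.0]
```

      If "add 5%" was meant as a relative increase instead, the same cap applies to `value * 1.05`.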

    • Vanilla_PuddinFudge@infosec.pub
      English
      arrow-up
      3
      ·
      4 days ago

      AI assumes too fucking much. I’d used it to set up a new 3D printer with Klipper to save some searching.

      Half the shit it pulled down was Marlin-oriented, then it had the gall to blame the config it gave me for it, like I wrote it.

      “motherfucker, listen here…”

    • xthexder@l.sw0.com
      English
      arrow-up
      12
      ·
      3 days ago

      Most certainly it won’t happen until after AI has developed a self-preservation bias. It’s too bad the solution is turning off the AI.

    • Saledovil@sh.itjust.works
      English
      arrow-up
      3
      ·
      3 days ago

      Current genAI? Never. There’s at least one breakthrough needed to build something capable of actual thinking.

  • SGforce@lemmy.ca
    English
    arrow-up
    31
    ·
    4 days ago

    It’s the same tech. It would have to be bigger or chew through “reasoning” tokens to beat benchmarks. So yeah, of course it is.

  • devfuuu@lemmy.world
    English
    arrow-up
    6
    arrow-down
    1
    ·
    3 days ago

    How can anyone look at that face and trust anything that madman could have to say?