• megopie@beehaw.org · 1 day ago

    The really crazy part is that it’s been like that for 4 years now. The models have improved on arbitrary metrics that the people making them decided upon, but in terms of real-world usability they’re basically the same. There’s a marginal improvement from running a model twice so it can check itself, but that’s only a marginal improvement for double the compute.

    It’s insanity that they’re burning billions upon billions to keep this charade going.

    At least with the dot-com bubble there was a clear and attainable use case to match the amount of hype. It got over-invested in before the technology was fully ready, but there was at least an obvious path to something worth the cost of running.

    • AbelianGrape@beehaw.org · 21 hours ago

      This is definitely true for code, but in terms of information retrieval and explaining complex topics they have gotten much better, in the sense that they can now cite real sources (with links).

      The analysis and synthesis they do of those sources is still often bogus, though. I’ve had one explain some simple Magic: The Gathering rules with real-looking words but completely wrong interpretations and conclusions, even though it did cite the correct rulebook with a link. I’ve also had one give a pretty strong overview of the construction and underlying theory of a particular compiler (a specific compiler, not the language it compiles) that matched up quite well with my own fairly deep understanding of that compiler.

      Overall the real information is better, but the hallucinations look more real too. And they’re still pretty unhelpful for programming in my experience.

      • megopie@beehaw.org · 11 hours ago

        A marginal improvement for a limited use case, and often achieved by having the model generate a critical response to its own output internally and using that to weed out particularly glaring errors, which significantly increases the amount of compute used.

        Not a revolutionary jump forward in capability. Not a trillion-dollar industry that justifies this level of investment or obsession.