• stravanasu@lemmy.ca · 49 points · 1 year ago

    Title:

    ChatGPT broke the Turing test

    Content:

    Other researchers agree that GPT-4 and other LLMs would probably now pass the popular conception of the Turing test. […]

    researchers […] reported that more than 1.5 million people had played their online game based on the Turing test. Players were assigned to chat for two minutes, either to another player or to an LLM-powered bot that the researchers had prompted to behave like a person. The players correctly identified bots just 60% of the time

    A complete contradiction. Trash Nature; it's become nothing more than an extremely expensive gossip magazine about science.

    PS: The Turing test involves comparing a bot with a human (without knowing which is which). So if more and more bots pass the test, that can be the result either of an increase in the bots' Artificial Intelligence or of an increase in humans' Natural Stupidity.

    • Marxism-Fennekinism@lemmy.ml · 7 points · 1 year ago

      Also, the Turing test isn't some holy grail of AI. It's just a thought experiment, and not even the most demanding test for an AI that we can think of. Passing it is impressive, don't get me wrong, but unlike what clickbait articles would tell you, it does not automatically mean an AI is sentient or smarter than humans or anything like that. It means it passed the thought experiment, nothing more.

      Also also, ChatGPT was not the first AI to pass the Turing test. Plenty have, some more than a decade earlier.

  • Peanut@sopuli.xyz · 18 points · 1 year ago

    Funny, I don't see much talk in this thread about François Chollet's Abstraction and Reasoning Corpus (ARC), which is emphasised in the article. It's a really neat take on how to probe the ability to reason abstractly.
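
    For anyone who hasn't seen ARC: each task gives a few input→output grid pairs and asks the solver to infer the underlying transformation rule from the examples alone. A toy sketch of that structure (the grids and the rule here are invented for illustration, not taken from the real corpus):

```python
# Toy ARC-style task: infer the rule from a few examples, apply it to a test grid.
# The (hidden) rule in this made-up task: mirror each grid left-to-right.

def mirror_lr(grid):
    """Flip each row horizontally."""
    return [row[::-1] for row in grid]

train_pairs = [
    ([[1, 0], [0, 2]], [[0, 1], [2, 0]]),
    ([[3, 3, 0]], [[0, 3, 3]]),
]

# A solver sees only train_pairs and must discover the rule itself.
for inp, out in train_pairs:
    assert mirror_lr(inp) == out  # our candidate rule fits every example

test_input = [[0, 5], [7, 0]]
print(mirror_lr(test_input))  # [[5, 0], [0, 7]]
```

    The point of the benchmark is that each task uses a different rule, so memorisation doesn't help; the solver has to form the abstraction fresh each time.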

    A couple of things stick out to me about GPT-4 and the like: the lack of understanding in realms that require multimodal interpretation, the inability to break down word- and letter-level relationships due to tokenization, the lack of true emotional ability, and the similarity to the "leap before you look" aspect of our own subconscious ability to pull words out of our own ass. Imagine if you could only say the first thing that came to mind, without ever thinking or correcting before letting the words out.
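
    The tokenization point is easy to demonstrate with a toy subword tokenizer (the vocabulary below is invented, not any real model's; real BPE vocabularies are learned, but the effect is the same):

```python
# Toy greedy longest-match subword tokenizer, the same flavour as BPE output.
# VOCAB is a made-up vocabulary for illustration only.
VOCAB = ["straw", "berry", "ber", "ry", "s", "t", "r", "a", "w", "b", "e", "y"]

def tokenize(word):
    """Greedily match the longest vocabulary piece at each position."""
    tokens, i = [], 0
    while i < len(word):
        for piece in sorted(VOCAB, key=len, reverse=True):
            if word.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:  # no vocab piece matches: fall back to the raw character
            tokens.append(word[i])
            i += 1
    return tokens

print(tokenize("strawberry"))  # ['straw', 'berry']
# The model receives two opaque token IDs, not ten letters, so a question
# like "how many r's are in strawberry?" has no letter-level signal to use.
```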

    I’m curious about what things will look like after solving those first couple problems, but there’s even more to figure out after that.

    Going by recent work I enjoy from Earl K. Miller, we seem to have oscillatory cycles of thought, directed by waves in a higher-dimensional representational space. This might explain how we predict and react, as well as how we hold a thought in order to bridge certain concepts together.

    I wonder if this aspect could be properly reconstructed in a model, or from functions built around concepts like the “tree of thought” paper.

    It’s really interesting comparing organic and artificial methods and abilities to process or create information.

  • bedrooms@kbin.social · 12 points · 1 year ago

    Honestly, though, I can't even decide whether other people have consciousness. Cogito ergo sum, if you know what I'm talking about.

  • NuPNuA@lemm.ee · 6 points · 1 year ago

    What about the Voight-Kampff test? What would it say if it saw a tortoise in the desert?

    • Droggl@lemmy.sdf.org · 1 point · 1 year ago

      I don't remember the numbers, but IIRC it was covered by one of the validation datasets, and GPT-4 did quite well on it.

      • Maestro@kbin.social · 2 points · 1 year ago

        Yeah, but did it do well on the specific examples from the Winograd paper? ChatGPT probably just learned those, since they are well known and oft repeated. Or does it do well on brand-new sentences constructed according to the Winograd schema?
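
        For context, a Winograd schema is a pair of sentences that differ in a single "special" word, which flips which noun a pronoun refers to. A minimal sketch of that structure (the trophy/suitcase pair is the classic published example; the data layout is my own):

```python
# Winograd schema: two near-identical sentences where one word flips the
# referent of the pronoun. Commonsense, not syntax, resolves "it".
schema = {
    "template": "The trophy doesn't fit in the suitcase because it is too {}.",
    "variants": {
        "big": "trophy",      # a too-big trophy won't fit
        "small": "suitcase",  # a too-small suitcase can't hold it
    },
}

for special_word, referent in schema["variants"].items():
    sentence = schema["template"].format(special_word)
    print(f"{sentence} -> 'it' = {referent}")
```

        The worry in the comment above is exactly this: published pairs like the one here are all over the training data, so only freshly written schemas test the reasoning rather than the memory.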