If AI is making the Turing test obsolete, what might be better?

Greenpepper@beehaw.org · 9 months ago

kbal@fedia.io · edit-2 9 months ago

The idea that “a computer would deserve to be called intelligent if it could deceive a human into believing that it was human” was already obsolete 50 years ago with ELIZA. Clever though it was, examining the source code made it clear that it did not deserve to be called intelligent any more than does today’s average toaster.
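To illustrate the point about the source code, here is a minimal ELIZA-style sketch in Python. The rules and the `respond` function are hypothetical and made up for illustration; this is not Weizenbaum's original script, only the same general idea of matching a pattern and echoing part of the input back.

```python
import re

# Hypothetical rules for illustration; not the original ELIZA script.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "Please go on."


def respond(message: str) -> str:
    """Return a canned reply built from the first rule whose pattern matches."""
    # Drop surrounding whitespace and trailing punctuation before matching.
    text = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo the captured fragment back inside the template.
            return template.format(match.group(1))
    # No rule matched: fall back to a generic prompt.
    return FALLBACK


print(respond("I am worried about the Turing test."))
print(respond("Nice weather today."))
```

Nothing in there models meaning; it is string substitution all the way down, which is why reading the code settles the question.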

And then more recently, the ever-evolving chatbots have made it increasingly difficult to administer a meaningful Turing test over the past 30 years as well. It requires care and expertise. It can’t be automated, and it can’t be done by the average person who hasn’t been specifically trained in it. They’re much better at fooling people who’ve never talked to one before, but I think someone with lots of practice identifying the bots of 2013 would still have little trouble catching out those of today.

admiralteal@kbin.social · 9 months ago

It cannot be automated or systematized because neural networks are the tool you use to defeat systems like that. If there’s a defined, objective test, a neural network can train for/on that test and ‘learn’ to ace it. It’s just what they do.

The only way to test for ‘true’ intelligence would be to perfectly define it first, such that when the NN aced the test that would prove intelligence. That is, IF you could perfectly define intelligence, doing so would more or less give you all the tools you needed to create it.

All these people claiming we already have general AI or even anything like it have put the cart so far before the horse.

jarfil@beehaw.org · 9 months ago

If a neural network can do it, then a neural network can do it… so we either have to accept that a neural network can be intelligent, or that no human can be intelligent.

If we accept that human NNs can be intelligent, then the only remaining question is how to compare a human NN to a machine NN.

Right now, the analysis of LLMs shows that they present: human-like categorization, super-human knowledge, and sub-human reasoning. So, depending on the measure, current LLMs can fall anywhere on the scale of “not AGI” to “AGI overlord”. It’s reasonable to expect larger models, with more multimodal training, to become fully “AGI overlord” by all measures in the very near future.

admiralteal@kbin.social · 9 months ago

Don’t buy into the techbro nonsense. Just because they’re called “neural networks” does not mean they work the same way the human brain does. We don’t know how the human brain fundamentally processes data, so anyone telling you these NNs work the same way is blowing wind out their ass.

jarfil@beehaw.org · 9 months ago

There was this book called “artificial intelligence” we had on CS something like 20 years ago, which started by analyzing in detail how biological neurons worked in the first few chapters… so maybe you’ll call me a “techbro” and just dismiss all I say, or read far enough to understand that these NNs are mimicking the behavior of actual neurons in a human brain.

We can discuss whether the higher level structures and processes are similar and to what degree, or whether the digital models represent the biological versions more or less accurately, but you can’t deny that the building blocks are replicating the human brain behavior at some level, because that’s exactly what they have been designed to do.

Froyn@kbin.social · 9 months ago

Voight-Kampff test maybe?

Imagine someone asked you “If Desk plus Love equals Fruit, why is turtle blue?”
AI will actually TRY to solve it.
Human nature would be to ask if the person asking the question is having a stroke or requires medical attention.

Pamasich@kbin.social · 9 months ago

So, I asked this to the three different conversation styles of Bing Chat.

The Precise style actually tried to solve it, came to the conclusion the question might be of philosophical nature, including some potential meanings, and asked for clarification.

The Balanced style told me basically the same as the other reply by admiralteal, that the question makes no sense and I should give more context if I actually want it answered.

The Creative style told me it didn’t understand the first part, but then answered the second part (the turtles being blue) seriously.

Froyn@kbin.social · 9 months ago

Would it be safe to say that all 3 answers would fail the test?

Pamasich@kbin.social · 9 months ago

Not sure, I’m not familiar with the test, just figured I’d tell the results from asking the AI.

I think based on what you said about it

AI will actually TRY to solve it.
Human nature would be to ask if the person asking the question is having a stroke or requires medical attention.

That the Balanced style didn’t fail, because while it didn’t ask about strokes or medical attention, it did point out I’m asking a nonsense question and refused to engage with it.

The Precise style did try to find an answer and the Creative style didn’t realize I’m fucking with it, so I do think based on the criteria they’d fail the test.

Though, honestly, I’d fail the test too. When asked such a question, I’d think there has to be an answer and it’s stupid of me not to see it and I’d look for it. I think the Precise style’s answer is very much where I’d end up.

admiralteal@kbin.social · 9 months ago

Nope, ChatGPT tells you it is a non sequitur and asks for more context or intention if the question is sincere.

Froyn@kbin.social · 9 months ago

You’re saying the test would work.
In 43+ years on this planet I’ve never HEARD someone seriously use “non sequitur” properly in a sentence.
Asking if the intention is sincere would be another flag given the circumstances (knowing they were being tested).

Toss in a couple real questions like: “What is the 42nd digit of pi?”, “What is the square root of -i ?”, and you’d find the AI pretty quick.
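For what it's worth, both of those questions have definite answers that are easy to check. A minimal sketch in Python, assuming "42nd digit" can mean either the 42nd decimal place or the 42nd digit counting the leading 3 (the question doesn't say which), and assuming the principal square root is wanted. The helper `pi_digits` is a hypothetical name; it uses Machin's formula with integer arithmetic.

```python
import cmath


def pi_digits(n: int) -> str:
    """Return pi as a digit string: '3' followed by n decimal digits."""
    guard = 10  # extra digits to absorb truncation error in the series
    scale = 10 ** (n + guard)

    def arctan_inv(x: int) -> int:
        # Taylor series for arctan(1/x), scaled to an integer.
        term = scale // x
        total = term
        x2 = x * x
        k, sign = 3, -1
        while term:
            term //= x2
            total += sign * (term // k)
            k += 2
            sign = -sign
        return total

    # Machin's formula: pi = 4 * (4*arctan(1/5) - arctan(1/239))
    pi_scaled = 4 * (4 * arctan_inv(5) - arctan_inv(239))
    return str(pi_scaled // 10 ** guard)


digits = pi_digits(50)
decimals = digits[1:]  # digits after the decimal point
print("42nd decimal place:", decimals[41])
print("42nd digit counting the leading 3:", digits[41])

# Principal square root of -i, which equals (1 - i) / sqrt(2).
root = cmath.sqrt(-1j)
print("sqrt(-i) =", root)
```

Note that -i has two square roots; the other one is the negative of the value printed here.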

admiralteal@kbin.social · edit-2 9 months ago

Cool.

Both the phrases you’re calling out as clearly AI came from me. Not used by ChatGPT, just how I summarized its response. I wonder if this is the first time someone has brazenly accused me of being an AI bot?

Froyn@kbin.social · 9 months ago

LoL, no I took you at your word which was my mistake
“ChatGPT tells you” read to me like you attempted and got that response.

pbjamm@beehaw.org · 9 months ago

Both the phrases you’re calling out as clearly AI came from me.

Perhaps you are an instance of an LLM and do not realize it.

jarfil@beehaw.org · 9 months ago

“If Desk plus Love equals Fruit, why is turtle blue?”

Assuming “Desk = x”, “Love = y”, “Fruit = x+y”, and “turtle blue = z”, it is so because you assigned arbitrary values to the words such that they fulfill the equation.

Am I an AI?

lily33@lemm.ee · edit-2 9 months ago

I disagree with the “limitations” they ascribe to the Turing test - if anything, they’re implementation issues. For example:

For instance, any of the games played during the test are imitation games designed to test whether or not a machine can imitate a human. The evaluators make decisions solely based on the language or tone of messages they receive.

There’s absolutely no reason why the evaluators shouldn’t take the content of the messages into account, and use it to judge the reasoning ability of whoever they’re chatting with.

The Doctor@beehaw.org · 9 months ago

The Turing test has been obsolete for better than two decades. The premise of this article is incorrect.

Thorny_Insight@lemm.ee · 9 months ago

Ironically, GPT4 fails the Turing test for having such wide knowledge about almost everything that you just know it’s not a human you’re talking to.

flatbield@beehaw.org · 9 months ago

The problem with AI is that it does not understand anything. You can have a completely reasonable sounding conversation that is just full of stupidity and the AI does not know it because it does not know anything.

Another AI issue is it works until it does not and that failure can be rather severe and unexpected. Again because the AI knows nothing.

Seems like we need some test to address this. They are basically the same problem. Or maybe it is some training so that the AI can know what it does not know.

intensely_human@lemm.ee · 9 months ago

Define “understand” as you’re using it here? What exactly does the AI not do, that humans do, that comprises “understanding”?

flatbield@beehaw.org · edit-2 9 months ago

Understanding the general sanity of some of their responses. Synthesizing new ideas. Having a larger context. AI tends to be idiot savants on one hand and really mediocre on the other.

You could argue that this is just a reflection of lack of training and scale but I wonder.

You will change my mind when I have had a machine interaction where the machine does not seem like an idiot.

Edit: AI people call the worst of these hallucinations but they are just nonsensical stuff that proves AI knows nothing and are just dumb correlation engines.

0ops@lemm.ee · edit-2 9 months ago

AI knows nothing and are just dumb correlation engines

Here’s a thought exercise, how do you “know”? How do you know your pet? LLMs like gpt can “know” about a dog in terms of words, because that’s what they “sense”, that’s how they interact with their “environment”. They understand words and how they relate to other words, basically words are their entire environment.

Now, can you describe how you know your dog without your senses, or anything derived from your senses? Remember, chemical receptors are “senses” too.

I remember reading about this awhile back but I don’t have the link on me: Did you know that people who were born blind but have their vision repaired years later don’t immediately know what “pointy” looks like? They never formed that correlation between the feeling of pointy and the visual of pointy the way that they could with the feeling and the word.

My point is, we’re correlation machines too

intensely_human@lemm.ee · 9 months ago

Have you ever interacted with a human that seemed like an idiot? Do you think that person is incapable of understanding?

flatbield@beehaw.org · 9 months ago

Most humans are not very intelligent either and many lack the ability to understand many things. We are not really thinking machines. We are emotional creatures that sometimes think. So I would not measure AI against the average human. That is a pretty low bar.

intensely_human@lemm.ee · 9 months ago

The point of logic is to carry you when your emotions try to stop you from thinking.

Yes, AI is scary. No, that doesn’t mean we get to throw out our definition of AI in order to avoid recognizing its presence.

FaceDeer@kbin.social · 9 months ago

I’m reminded of the apocryphal Gandhi quote “first they ignore you, then they laugh at you, then they fight you, then you win.” It seems like the general zeitgeist is in between the laugh/fight stages for AI right now.

intensely_human@lemm.ee · 9 months ago

It’s just too scary to acknowledge. Same thing with aliens. They’re both horrifying literally beyond imagination, and both for the same reason, and so it’s more natural to avoid acknowledging it.

Everything we’ve ever known is a house of cards and it’s terrifying to bring that to awareness.

AutoTL;DR@lemmings.world · 9 months ago

🤖 I’m a bot that provides automatic summaries for articles:

To try to answer this question, a team of researchers has proposed a novel framework that works like a psychological study for software.

This is why the Turing Test may no longer be relevant, and there is a need for new evaluation methods that could effectively assess the intelligence of machines, according to the researchers.

During the Turing Test, evaluators play different games involving text-based communications with real humans and AI programs (machines or chatbots).

The same applies to AI as well, according to a study from Stanford University which suggests that machines that could self-reflect are more practical for human use.

“AI agents that can leverage prior experience and adapt well by efficiently exploring new or changing environments will lead to much more adaptive, flexible technologies, from household robotics to personalized learning tools,” Nick Haber, an assistant professor from Stanford University who was not involved in the current study, said.

“It doesn’t tell us anything about what a system can do or understand, anything about whether it has established complex inner monologues or can engage in planning over abstract time horizons, which is key to human intelligence,” Mustafa Suleyman, an AI expert and co-founder of DeepMind, told Bloomberg.

Saved 73% of original text.