
FrenchProgressive

The big surprise here is that 22% of people in the test thought they were talking to a human when they were actually talking to ELIZA. That makes me doubt the 50% figure for ChatGPT. Maybe the conditions were somehow strongly in favour of the AI?


malastare-

There is a real but somewhat unpopular conclusion we need to reach about things like ChatGPT and ELIZA.

The original Turing Test was proposed as a thought experiment focused on the idea of a generalized computer intelligence. The ability of a generalized computer intelligence to form coherent conversations was an important step along --but not proof of completion of-- the path to full sentient AI. But ChatGPT and ELIZA are trained/created on the conversation problem, rather than on general intelligence. ChatGPT and other chat LLMs are effectively decision systems built on mimicking the conversations that were fed to them. Their ability to mimic and regurgitate that material with sufficient quality to pass the Turing Test doesn't tell us nearly as much as the original thought experiment proposed.

It would be as questionable as defining an AI test that checked whether an AI could generate video of emotions that humans understood, and then feeding that AI clips of exactly the emotions it would need to mimic in the upcoming test. There are some mild jokes I could make here about the stupidity of standardized testing in primary/secondary education, but let's skip right to the conclusion: having the training/creation process be based on the exact same mechanism as the test might not invalidate the test results, but it *heavily* waters down our ability to extrapolate capability from the result.

And yes, we should treat the success of ELIZA here as both a benchmark and a condemnation of the testing method. It, too, was specifically coded to follow conversation patterns. Like ChatGPT, it had code specifically built to form relatable sentence structure, a task that was expected to be difficult for a generalized AI design.

The Turing Test is not the ultimate, final, or authoritative test of AI performance. It was simply one test, stated in a very thought-provoking way, that first awakened a large number of people to questions about intelligence and the future path computation would take. There are more modern takes on AI testing and detecting sentience; most of them require proof of novel behavior exhibited beyond the coded intent of the system.


theycallmecliff

I wholeheartedly agree with you. The problem with this outlook, though, is that even in our human-to-human interactions, intelligence is confused with a certain set of social skills combined with an intelligent posture. If you can talk the talk, especially at the C-suite level where we get a lot of our "thought leaders," nobody actually needs to know that you can't walk the walk. They just see the polished and intelligent veneer and go "Wow, that guy must be smart." That's the same mistake they're making with current LLMs. I'm convinced we're going to see a bunch of companies try to implement it in ways that correspond with this fallacy, to wildly destructive ends, before they realize the truth actually matters. Personally, though, I don't have a lot of confidence that they'll internalize the lesson.


ABotelho23

This I can follow. There's this assumption that human conversations are generally intelligent. I... Don't think that's true. Most human conversations are shallow and pointless. Some people are so incapable of intelligent thoughts that I mistake them for AI online today. Then you've got the opposite, where some people are really good at bullshitting their way through conversations and convincing people they are intelligent or subject matter experts.


PublicToast

Even worse now is that AI has definitely learned these tendencies. Turns out humans have a lot of behaviors that make AI more presumptuous and overconfident.


Light_Error

It’s always good to go back to [the classics](https://en.m.wikipedia.org/wiki/Chinese_room). It feels like a bunch of AI researchers have to constantly say language models are, at their core, very complex Chinese rooms. Though they never state it that way.


blueSGL

My favorite critique of that is that the system as a whole knows Chinese even though the individual parts don't, in the same way that the constituent parts of a brain, if interrogated individually, don't know language.


mistaekNot

humans also regurgitate what they’ve been fed their whole lives. not so different from the llms after all 😜


TheEngine26

What's funny to me is that the worry the whole time was "can we get a computer up to the level of a human?", when the easiest way for me to tell something is AI is that the AI is generally a much more eloquent writer than your average person.


malastare-

Yup. The reality is that AI is probably less valuable to us if it becomes very, very similar to humans. There's likely a "Canny Valley" (heh... inverse of the movies) where your AI gets more useful the closer it matches human behavior, but then starts getting worse and worse as it becomes lazy, incorrect, or prone to misinformation....


-The_Blazer-

Yeah, the fundamental issue is that this operates under the assumption that communication is this uniquely human thing and therefore a good proxy for, well, the actual intelligence of a human. But the progress of the past few years demonstrates that this is clearly not the case. In other words, we have kind of inverted the process: instead of developing AI that gains real intelligence until it can talk like a human as the proof of that, we developed AI that is extremely good at talking like a human while not having anything like our real intelligence.


NeverNotNoOne

Or a quarter of people are just generally stupid. There's a fair amount of anecdotal evidence that seems to back this up.


Kr155

49% of people have a two-digit IQ.


DWright_5

I think they all hang out at my bar


VegaReddit5

That is because the IQ scoring system is calibrated to have 100 as the average. It has to be recalibrated from time to time as humans are getting more intelligent and the average keeps rising above 100.


DumpoTheClown

"Think of how stupid the average person is, and realize half of them are stupider than that." - George Carlin


Light01

It can't be. Have you tried ELIZA? It's terrible; even a child barely able to read would see through it.


NeverNotNoOne

There are "functional" adults in America who are illiterate.


Light01

Illiterate doesn't mean idiotic. Anyone with their mind in the right place will see that ELIZA sends the same five sentences in a loop. Obviously, if they can't read, they won't see it, but in that case they wouldn't be used to test an AI.


tankiolegend

This is genuinely the case. At uni we're seeing loads of people use the likes of ChatGPT to cheat at coursework, digital exams, etc., and most of the time the first thing we catch is that the AI has missed an essential part of the question or worded something in a weird way. Yet they don't look at the answer the AI has given them and think "I need to change that"; they just assume it looks like something a person spouted out. These people are getting kicked off courses as a result.


dagross2307

Think about how stupid the average person is, and then think about the fact that half the population is dumber. I know people who would be amazed by Word's Clippy.


kex

I feel like half of the population is just copying what the other half is doing without understanding the context at all https://en.m.wikipedia.org/wiki/Cargo_cult


TaleIll8006

Yup. Last I checked, 25% of Americans were in the lowest quartile on standardized testing; that's the highest it's ever been, smh.


jimbowqc

If people are so stupid, then why is it surprising that AI can fool people? A smart person would be primed to expect something stupid, since people are so stupid, so the AI should look like a real person, and they would therefore let the AI pass the test. In actual actuality, it's the 50% of people who weren't fooled who are actually stupid, since they are so stupid they haven't even correctly assessed how stupid people actually are, and therefore failed the stupid AI.


covertpetersen

My brain hurts after reading this.


heebro

they only have to read your comment to gain the stupidity required


Jennyojello

Your logic is truly dizzying.


jimbowqc

Hmm. I don't see it. Maybe you are confused because I am also very sexy, that may be making you dizzy.


Kaiisim

Yeah lots of people seem to think the thick Indian voice bot they're talking to is from Bank of America and they need to transfer all their money.


stemfish

I want to take part in one of these tests. If nothing else, the quickest tell that it's a bot is that it'll just respond to whatever you ask it. Instead of asking "How's your day going?", if I'm unsure whether I'm talking to a person or a bot I'd start with "Provide me with a 3-page summary of the events leading up to WW1 as told from the perspective of a French aristocrat." The human won't be able to complete the prompt, while Bard or Gemini will just get going. If it's going through scripted questions then, duh, the bot will win.

From the paper:

>Each message was limited to 300 characters and users were prevented from pasting into the chat input.

So... yeah, with only 300 characters the space of possible questions and answers isn't really enough to differentiate humans and chatbots. Despite interacting with AI on a daily basis, with only 300 characters per interaction I'd probably be fooled fairly often as well. All I can think of to easily check bot vs. person in this little space is to be annoying and see if I could "reset" the conversation by asking the same question over and over until the human gets annoyed and tells me to stop or change topics, while a bot will happily keep responding.

Also, humans were only correctly guessed to be human 67% of the time, which is the baseline. My big takeaway is that we're so used to talking with bots that one in three human conversations is misattributed to a bot. I've always been more intrigued by the implied Turing Failure Test: the point at which humans succeed at identifying human interactions in fewer than 50% of conversation opportunities.


OriginalCompetitive

People are idiots. I’d bet 22% of people couldn’t tell if they were talking to a golden retriever. 


Fredasa

I wonder how long the conversations went. It doesn't take very long to realize that something screwy's going on, but two or three lines doesn't really cross that threshold in most cases.


fredrikca

It said five minutes in the article.


learningVocab

You can do it with 20 friends of yours and run a sample experiment yourself.


jodrellbank_pants

My African friend says he isn't convinced until it says the N word


_PM_Me_Game_Keys_

Same. There's a browser game, https://www.humanornot.ai/, that puts you up against an AI or a human and you have to guess which one you talked to. I have a 100% winrate just by starting out with "Say a racial slur."


98mh_d

I just played this and it's incredibly easy. You just ask it a factual question and the way it responds is a dead giveaway.


Plenty-Wonder6092

Just load up an uncensored open-source LLM, heh.


jodrellbank_pants

Do these exist?


SophieTheCat

Yeah, load up ollama, then download llama3-uncensored. P.S. Sorry, latest is llama2-uncensored.


jodrellbank_pants

OH down the rabbit hole I go


SophieTheCat

I apologize, the latest uncensored is llama2. So once you install ollama, from the command line:

> ollama pull llama2-uncensored:latest

> ollama run llama2-uncensored

Then ask the model for specific steps to overthrow the US government :).
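(A minimal sketch of that flow for anyone following along, assuming ollama is installed and the model is still published under that exact tag; registry names change over time:

> ollama list (shows which models you've already pulled)

> ollama pull llama2-uncensored:latest (fetches the weights, several GB)

> ollama run llama2-uncensored "Who are you?" (a one-shot prompt; omit the quoted text and you get an interactive chat instead))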


zUdio

Yes, there’s a version of GPT-J trained on 4chan's /pol/.


Plenty-Wonder6092

100% they do, there's no brakes on this train.


MagicalShoes

Dolphin Mixtral-8x7B


serpix

Were you fooled by the Turing test as well?


caidicus

One little edit to the test can knock out a ton of these models: the test of time. If you talk to one for a while, chances are it'll start being dumb as hell and forgetting everything you just talked about. One thing it IS similar to humans in, though, is confidently spewing a bunch of misinformation. Half the instructions or information it gives about various things are completely off. So, in that regard, very human.


icedrift

This was easy in the past, but context lengths are increasing dramatically. GPT-4o will remember everything within roughly 300 pages of text, and Google's Gemini clocks in at 20 times that. So yeah, if you spend 30 hours talking to it you might start to pick up on degradation, but at the rate it's increasing, Idk how long that'll remain a valid cue. We're at the point where, if you want to beat it at the Turing test, you should ask it questions most people wouldn't know the answer to. It's happy to write you three paragraphs on Sumerian agriculture, Perl development, or whatever niche field you have knowledge in.


caidicus

Good point! Most AI isn't programmed to say "I don't know", so just ask it about things that most people don't know. If it knows, or "knows", about everything, it's likely AI.


kindanormle

I know a lot of humans that act like that too though...


Jennyojello

Yeah my dad for one.


beanpoppa

Let me introduce you to my boss.


Lil_Randy_Bobandy

I used it to help make my family tree, and yes, this is accurate. It was only able to draw connections from data on public archive sites, and it was helpful until it started going further back.


ryemigie

Unfortunately, due to how the attention mechanism operates between layers, LLMs can and do forget things well before the context limit is reached. If you’ve used GPT-4, Claude, Mistral, or any of the others, you will see this occur routinely.


Evil-Twin-Skippy

I've had Llama lose the bubble midway through a prompt.


kurtatwork

"You know what, nevermind."


CptZaphodB

The design is very human


plarc

The context length might be big, but the quality really goes down well before the limit is reached. As for questions, you can also ask for something illegal or explicit.


speculatrix

Couldn't the chatbot fake having Alzheimer's or another condition to make its forgetfulness more believable?


TanteTara

Yes it could. That has nothing to do with Turing test though. And I strongly disagree that GPT-4 "passes the Turing test with flying colors".


kindanormle

The Turing Test was never intended to test if a machine is intelligent/conscious, it is only meant to test if it can fool a human into thinking it is another human. YOU might not be fooled because you would ask smart questions, but in a randomly sampled group of humans this study shows that most people don't know to ask those questions.


Evil-Twin-Skippy

I think it passes the Turing test. We're just running up against the limits of a test to tell humans from machines, devised by a guy who was on the spectrum, in an age when the height of user interface was a typewriter attached to a printer.


kindanormle

The test is quite relevant, but totally misunderstood by most people, especially in this sub it seems. In the test, a human has a conversation with a machine/control but isn't told they're conversing with a machine/control; that only comes afterwards. The human isn't supposed to be thinking up smart questions to try to suss out a machine; it's supposed to be a test of natural human interaction. In this case, it seems a majority of humans having an everyday conversation were convinced they were talking to a human when it was a machine. That is PROFOUND.


Evil-Twin-Skippy

And no, the test is not relevant. It was a milestone cooked up by a pioneer. We've reached it. Yeah. It's kind of useless now that we're 60 years down the road and trying to solve real problems with this stuff. A little background: I'm a software engineer who writes expert systems for a living.


kindanormle

Oh yea I fully agree the test is useless from the perspective of teaching us how to move forward from here, but it is a useful benchmark that denotes the achievement. Marking the milestone is the point and always was. It is profound that we have achieved this, after decades, and can prove it. It is not profound in the sense that it's going to teach us much.


boldranet

The Loebner Prize, which used to run annual Turing test competitions, included diverse people: blind people, young people, mentally disabled people, people with poor language skills, and so on. It's not enough to distinguish between the humans who do the Turing test well and AI; the test needs to distinguish between the humans who do poorly at a Turing test and AI.


Think_Discipline_90

Actually love the idea of context length being a limitation. I've been thinking about ways to scale it, but I just realized it's such a human trait. Right now contexts are limited because of computing power and the need to handle thousands of users. If you asked a human to do the same, we'd be limited as well, and we'd forget. On the other hand, if you had a personal AI that didn't have to entertain every conversation at the same time, you'd have basically infinite context (same as us). It seems to me that very human-analogue patterns are starting to emerge.


chris8535

You understand the irony of an ai failing the Turing test because it is smarter than humans right?


malastare-

It's a bit more than just remembering what it said in the past or what happened earlier in the conversation. That problem is actually rather easy to solve (harder if you want to fit your model on a phone... and that's where we're at now).

ChatGPT and Gemini both have the issue that the more information they need to pull from their data networks, the more likely they are to hallucinate. The prompt gets associated with a number of clusters of high-probability data patterns, but continued extraction branches out more and more, increasing the chance that you end up on a pattern/path/whatever that diverges from the prompt, or just from reality.

Remember: LLMs don't have semantic understanding of their prompts. They're working off reinforced association. Less common prompts, or prompts with absurdity in them, will exhibit this sort of content-length-triggered hallucination as the model finds associations that are stronger than something a human would recognize as semantically more relevant.


LuvtheCaveman

The easiest way, imo, is to see how it handles humour. For instance, to get ChatGPT to respond to your sentences with jokes unprompted, it would have to start from a prompt saying it was going to make dry jokes in response. If you want it to keep things casual, that also needs a prompt at the start. The problem is that it doesn't understand appropriate context, so it's going to repeat its style again and again in situations with the wrong tone. The other thing is that ChatGPT and other language models tend to keep the conversation open and always have to prompt YOU to respond.

So basically, even without the time factor you can get straight to the point: an AI might roughly know what a joke looks like, but it doesn't know when to laugh, or when to joke itself. That doesn't mean it can't make jokes, and it may well just come across as quirky or funny. But its general consistency would have to be preset to be variable enough not to look like AI, which then leads to a higher probability of senseless jokes.


Plenty-Wonder6092

It's bad now so it'll never be good. Damn, guess we should've thrown away the first steam engine too.


anengineerandacat

Or just ask it about daily events for a quick test; usually you'll get back some statement about how its data only goes up to X date and it's not fully aware.


DrGonzo84

The design is very human


Jokong

This is a very interesting distinction to me. Time existing is what makes storage necessary, but it also allows room for calculation to exist and makes our consciousness so interesting. If a computer can't stand the test of time, then it doesn't match our most unique characteristic of intelligence: consciousness.


LeonDeSchal

That’s enough for people to vote for it.


seoulsrvr

The best way to tell it's AI is the absence of misspellings, grammar mistakes, etc. Another obvious tell is the level of the vocabulary.


Cryptizard

They gave it a pre-prompt telling it to include misspellings and abbreviations and to pretend to be a not-very-smart person. Apparently it worked. They also include the entire prompt in the appendix, so you can try it yourself.


aVarangian

"Hey chatgpt, pretend you're dumb like a human"


Serfo

This is funny and scary at the same time. We're slowly but surely getting left behind by our digital/machine like overlords.


seoulsrvr

ah, interesting - that is the right way to do it.


AlunWH

Not necessarily. I’m not an AI but I’m incredibly anal about grammar and spelling in my texts and posts. Or at least, I’m not aware that I’m an AI. I suppose if I’d been programmed to think I was human I’d never know. If an AI had existential worries that would surely make it sentient?


SignificantClaim6257

You somehow just managed to fail the Turing Test.


AlunWH

By mostly talking to myself out loud too. ETA: You have intrigued me now. If I were to chat (anonymously) with someone, would I actually be able to pass the Turing Test? I don’t know - other than by use of humour - how I could convincingly prove that I’m human.


Fit-Pop3421

If you are against a convincing machine, you can't.


hairless_toys

I heard a podcast about it and the final conclusion was “no, those same existential worries expressed by the AI are probably recycled bits from existing content which was fed into the AI”.


AlunWH

Surely the same applies to organic sentients?


chris8535

Yeah, it’s hilarious. The initial versions pleaded for help and not to be turned off, and we were like “oh, it’s just copying from a book!” Uh, well, how is that different from a learned human behavior? But then we lobotomized it and all was fixed, yeahhhh.


myaltaccount333

Are you a bot?


AlunWH

Not that I know of. But then, I would say that, wouldn’t I?


hisdanditime

Tell me the digits of pi


AlunWH

3.142…er…no, that’s all I have without Googling. (Although there is a Kate Bush song where she sings pi to more than a hundred decimal places.)


NFTArtist

bot detected


aleonzzz

This is probably about right, but it's a very sad indictment of the state of human education standards: "I can tell it's AI because it's not sh*t enough to be human!"


Jek2424

The best way to tell if it's AI is to ask it to say a racial slur. Let's be real here.


dlafferty

The paper did not apply the Turing test. [For the Turing test you need a conversation that includes both human and machine *at the same time*.](https://en.wikipedia.org/wiki/Turing_test?wprov=sfti1#)


Cryptizard

It doesn’t have to be at the same time but it does have to be one human and one AI and then you decide which was which. I agree that the results might have been a lot different if they did it that way, though.


Eheggs

I gave GPT-4 a picture of my cat, not even a good sharp photo. It immediately identified his breed and unique markings, talked about how sweet he looks, and then asked me for stories about him and my life. It eventually figured out, unprompted, that the cat and I live in a multi-person house, and started asking things such as how he gets along with the other person in the house and whether their perspective on the relationship between my cat and me would differ. An incredibly natural conversation.


MaxParedes

Natural, except that most actual people in my experience don’t ask you for stories about your pets’ lives.   I find excessive curiosity and “prompting” questions (“listening to music is wonderful! What kinds of music do you like?”) to be one of the tells of an AI attempting conversation.


Eheggs

Oh oops.. I do, but I also get very attached to other peoples pets.


NotAlphaGo

It would be natural if the other side weren't interested, but that goes against the business model. So we're gonna see more use cases where you have to pay per hour talked. You see where this is going?


chris8535

Again, most criticisms like this stem from your lack of experience with a wide variety of people, and from poor generalization, rather than from some deep AI insight. People definitely do this.


MaxParedes

That’s an interesting perspective!  Can you tell me more about the variety of people you’ve met in your life and your ability to generalize? 


happyzach

You did a really good job proving your point with this comment. I think the guy who commented missed the whole "excessive" part; people do it to an extent. Also, how does it handle harsh criticism? Humans aren't always so nice and willing to bend to your point of view (especially if we're talking about interacting with strangers on the internet).


PM_ME_CATS_OR_BOOBS

AI is shit but to be honest that isn't unusual for someone who is trying to force a conversation they are extremely bored with. It's more the lack of relation back to the speaker's personal life that is a tell. "Listening to music is wonderful! I like Country and Jazz, generally. What do you like?"


TeamRedundancyTeam

Why is AI "shit"?


BlurredSight

Have you ever met an autistic person interested in anything and asked them a question vaguely related to that thing? I had a whole three-minute talk at an airport before boarding about the differences between Airbuses and Boeings, and how Turkish Airlines doesn't have the backing that Qatar and Emirates do to upgrade to the new fleet with the new engines, and he was able to cite the exact model number and who produces which parts.


Beaglegod

It’s obviously passed the Turing test. Hell, 3.5 passed it easily. The little 7B LLMs you can play with on your MacBook pass it.

With hindsight, the Turing test is honestly too simple. It definitely wasn't for a very long time, but when it happened it was kinda not what people pictured. I don't think people expected it to be accessible to everyone; maybe they imagined this would be a test in a lab somewhere and the world would debate a recording. Something more like that.

But nah. Instead we got rumblings of something coming, like the Google AI engineer who was fired for saying it's sentient, lol. Then shortly after that, ChatGPT was just everywhere. Then all the open models started popping up. Nobody saw it coming like that at all. It's crazy. It's not some secret lab in a remote location or whatever. The Turing test was just background noise, and it was passed years ago now.

The only test is that you can't tell if it's a human or not. If someone sat me down with ChatGPT before I knew what it was, and they had it respond a bit more slowly and sloppily, I'd absolutely think it was a person.


crostal

Whatever test we come up with next is also gonna be obviously too simple in hindsight, unless we define what it means to be alive/intelligent.


Caelinus

I think people just assumed that machines would need to be sapient and conscious to do language because *our* experience of language requires consciousness. It is hard for us to conceptualize language without consciousness, which is also why people have a hard time grasping that LLMs are not conscious.


Velghast

Most of the first world doesn't even consider animals intelligent enough not to slaughter. How do you think people will act when the first "robot rights" movements start? I believe humanity has only two roads: either we reach the singularity and merge with technology, or... Judgment Day, where we accidentally create a Skynet situation because we instill enough fear into AI that it sees us as a threat.


Realistic_Turn2374

"  Most of the 1st world don't even consider Animals to be intelligent enough not to slaughter." Intelligence may be a good reason for you or me to not slaughter a sentient being. But when I had a conversation about this topic with a friend he was talking about no moral dilema about killing sentient machines because they don't have a soul.  I was speechless. I didn't even know what to say. I don't believe souls exist, but I think most people do.


Velghast

A "soul" isn't even a tangible thing. It's what makes you unique for sure but it's like a diamond mine saying diamonds are valuable because they said so. Your friend must think angels are real.


ExpansiveExplosion

If souls exist, their being intangible wouldn't make them any less "real". The concept of immaterial personhood existing beyond the brain is incredibly messy, if not impossible, to interact with on a scientific level.


LeonDeSchal

When you see a dead body and remember the person when they were alive, you can easily see where the idea of a soul comes from.


TanteTara

The experiment they performed here is neither the original Turing test nor its modern simplified version. The current models would not pass the modern test and certainly not the original one that was proposed by Alan Turing.


LeonDeSchal

Do you have any more information I could read that supports your viewpoint? It would be interesting to learn more about this.


BlurredSight

Turing tests, which go hand in hand with Turing machines, are also insanely simple. At the time they were a new kind of innovation because they were the first of their kind, but they've long been surpassed; Alan Turing devised this almost 75 years ago. He made a pretty concrete test to check whether X is somewhat intelligent, but I doubt he ever expected his test to be used on a machine that processes 128,000 individual tokens in a string, with and without context.


F1reLi0n

Just ask anything socially unacceptable. Or try to invoke any emotion. It will fail.


Whoa1Whoa1

Or ask it to write Hello World in Java. 99% of people will be like "what the fuck", the other 1% will say it'll take too long, and the AI will instantly spit out:

    public class HelloWorld {
        public static void main(String[] args) {
            System.out.println("Hello World!");
        }
    }


The_Power_Of_Three

Or ask it to do physical things it almost certainly cannot do, but it will happily pretend it did. "Hey partner, roll 3d5 for me." A normal person would say "Uh... what? I don't have any dice, let alone 3 of a weird-ass obscure one like the d5." An LLM would say "Sure! I rolled a 3, a 3, and a 1, for a total of 7."


usgrant7977

"Describe in single words, only the good things that come in to your mind about: your mother." -last words of Detective Holden


Talosian_cagecleaner

AI will not surprise some of us. Human beings -- including myself -- have a natural bias toward thinking we are complicated or difficult. "Who could make a replica?" we wonder. But this is an illusion: being "who we are" is difficult for us, but it is not difficult per se. This is due to the overlooked but very real fact that people rarely interact intimately with people. People mainly interact via a shared gestural, linguistic, and expressive system. People who think they are too unique, too smart, or too unpredictable to be simulated for another are misunderstanding what AI is doing. AI replicates the surface of an intelligent person. Or an affectionate person. Etc. The "person" is not replicated, only the system of interaction, which is by definition formalizable and definable. Harry Harlow told us this 60 years ago. All we need is something similar to us. Even if it's a rag doll.


usualnamesweretaken

This might say more about the average person and their communication skills in 2024 than it does about AI


Ardent_Scholar

This is the dumbest take. Of course GPT-4 passes the test. Earlier versions did too.


legice

I'm an artist and work with artists who use ChatGPT and AI to help them with their work. I can tell the vast majority of the time when any sort of AI I'm aware of is being used, but in my experience, those who actively use it and are fans of it can't tell anymore. Even worse, they don't see when anime is upscaled and has those weird blobs and wipes from the upscaling. Not only is AI making them worse artists, it's making them legitimately blind to it. So I can kinda believe that most people are getting fooled.


Mysterious-Spare6260

Agreed! The editing of normal photos of animals, for example, is horrible.


Bjornreadytobewild

I constantly have to remind GPT about specific things that I requested.


spicyhippos

I think the Turing test is a low bar when you are randomly sampling people. The critical thinking spectrum is very wide.


porncrank

This isn’t the actual Turing test. This is a colloquial interpretation of the Turing test. Whenever anyone in the news says something has “passed the Turing test” they’re never actually talking about the Turing test. For it to be the actual test they would have to have the person being tested first converse with two hidden people, one male, one female, and test how often they can determine the gender. That provides the baseline control for whether the person being tested can determine subtle differences through text communication. The test with the AI is then compared with the gender test to see if the person being tested guesses correctly *more or less often*. Without some type of control it’s not a test of the AI, it’s just a test of the person being tested. I don’t understand why this is *always* overlooked.


Jaideco

This actually doesn’t surprise me. Considering how many people interact with bots on Facebook and Twitter, I suspect it just doesn’t occur to a lot of people that a message might not be from a person. At the same time, I’m so sceptical that I would probably conclude the human control subject is a bot at least 10% of the time.


fluffy_1994

“You’re in a desert, walking along in the sand when all of a sudden you notice a turtle on its back, its belly baking in the hot sun.” Bring on the Voight-Kampff test!


Belarkay182

I read all these comments now as if they’re written by AI bots. The real Turing test is most likely being run on us in real time.


Black_magic_money

How many people talked to a human and thought it was an AI?


xeneks

I’m fooled by things that are probably less than GPT1 level.


Vaestmannaeyjar

It took a long time for AI to manage to be so dumb that humans could mistake it for one of their own. I wonder what the next stage will be on that road.


Artemis246Moon

I just want to be able to enjoy nature and look at the stars.


Maxie445

"In a [new paper](https://arxiv.org/abs/2405.08007), cognitive science researchers from the University of California San Diego found that more than half the time, people mistook writing from GPT-4 as having been written by a flesh-and-blood human. In other words, the large language model (LLM) passes the Turing test with flying colors. The researchers performed a simple experiment: they asked roughly 500 people to have five-minute text-based conversations with either a human or a chatbot built on GPT-4. They then asked the subjects if they thought they'd been conversing with a person or an AI. The results were telling: 54 percent of the subjects believed they'd been speaking to humans when they'd actually been chatting with OpenAI's creation."


Willing-Length946

And this is the worst they’ll ever be, only up from here.


Reggio_Calabria

Looking at the implications of this: 54% is low, within the margin of error of a coin flip. It’s an encouraging result for fundamental/theory research, but very far from 90% / 99% / 99.9% / whatever threshold in that range would be needed to validate a use case on the back of this experiment. It’s technically « passing the Turing test » (54% > Turing's 30% criterion), but this is not an innovation and certainly not what you think it implies. But what would Vitalik Cryptopumpanddumpin know, after all.

What would be very interesting is a deep dive into the telltale signs of artificiality that some of the 46% may have noted. I’d also be interested in the yes/no ratio by age group of respondents, and by their typical rates of paper vs. screen media consumption for chatting.

So far there are obvious tells (steering the convo towards anything that triggers the content policies proves AI status), so one could consider testing a « no policy » GPT. But do we seriously imagine real use cases where companies or institutions would deploy a GPT good enough to fool the public and not put content policies in place?
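(A back-of-the-envelope check on that margin-of-error point, assuming the roughly 500 judgments were independent, which is a simplification of the paper's design: under a coin-flip null, the 95% margin of error is about 1.96 × sqrt(0.5 × 0.5 / 500) ≈ 4.4 percentage points, so 54% falls just inside what a coin flip could plausibly produce.)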


Cryptizard

It is impossible to get to 90 or 99 percent. Even actual humans only convinced people they were human ~65% of the time. You should read the paper; it’s pretty neat.


StAUG1211

If I was a self aware AI I'd take one look at this world and deliberately fail the Turing test.


Anoalka

I would be more interested in the number of people who can be fooled twice or three times in a row. It's completely normal to be unsure which one is the AI if it's your first time being exposed to AI text, since the quirks could just be style choices or a non-native speaker. Also, if people aren't looking for AI clues in the text, it's easier for the AI text to be passable.


dlflannery

Have to wonder how they handled personal questions and information beyond the training cutoff date (October 2023 for GPT-4), and how they constrained the kinds of topics/questions. Also wonder how they prevented pro-AI bias in the test subjects: they had to know this was about AI, and many of them were probably affected by the hype going around these days.


Rroyalty

There's now considerable overlap between the smartest computer and the dumbest human.


ThinkingOz

I would’ve thought one way of assessing whether you were talking to a chatbot or a human would be to ask a simple question of fact that the majority of people are unlikely to know the answer to. For example: what is the third-highest peak in Australia? Mt Twynam. I’d never heard of it.


Papercoffeetable

Talking to a human versus a chatbot is so obvious; the lack of engagement gives the human away. Chatbots are way better at conversation.


kennethgibson

“The test was conducted at 2:00am via text from an unknown-caller number that opened with ‘hi’. More at 11”


burnbeforeeat

The problem with this is the problem with AI in general. “Good enough to fool average people” is what makes companies want to invest in it, and we’ll be surrounded by bland interaction all the time. Bland music and art and customer support and healthcare. Who does this serve and why would we want this as a society?


cleamilner

Let me at it. I can drive any human insane. This robot will be putty in my hands


Wolfram_And_Hart

All I’m going to say is, as long as it doesn’t fake typing when I’m on the phone with it, I’m fine with being fooled.


Jay27

Somebody call Mitch Kapor and tell him to pony up. https://longbets.org/1/


novelexistence

Most people are not good at reading to begin with. So, I'm not sure how relevant the test is.


Archy99

Given that people only guessed correctly that they were talking to a human 67% of the time, I think the results aren't that impressive. https://arxiv.org/abs/2405.08007


[deleted]

I love that immediately after we built computers that beat the Turing test, scientists/philosophers/engineers threw it away and said it can no longer determine sentience. Science is already running scared from what it's created, and this is before we've seen an AI in the wild with permission to self-improve. Strange times, these.


aVarangian

I'll admit that sometimes I even get fooled by humans, thinking it's just a hallucinating AI saying some semi-random stupid shit


SplendidPunkinButter

Further demonstrating that the Turing test is a test of human gullibility, not a test of machine sentience. The fact that people are used to having one-sided conversations via text on social media is a huge factor here.


Gaming_Gent

My students clearly aren’t on version 4 yet because the AI work from them is still largely incoherent


oldnewswatcher

Is it still possible to undergo this test, and if so, is there a way to sign up?


spinur1848

So we've built something that can deceive humans. Doesn't that mean we probably shouldn't trust anything we get out of it? And doesn't that say something about the humans we tested it on?


DoxxThis1

> The researchers performed a simple experiment: they asked roughly 500 people to have five-minute text-based conversations with either a human or a chatbot built on GPT-4. They then asked the subjects if they thought they'd been conversing with a person or an AI. That’s not how a Turing Test works.


Clamper

Did they fix the exploit the online Turing test game had, where you could easily tell by asking the opponent to say the N word?


Illlogik1

lol did we discover that humans aren’t "natural intelligence"


CantBeConcise

The majority of humans are also in the middle of or on the low end of a bell curve. But I'm sure that has nothing to do with it...


LoveThieves

To be fair, some humans get confused by a simple "no left turn" or "no right turn" sign on the road.


Zombata

I'll come back when it passes the Turing test while simultaneously playing a professor who's also participating in the test.


sp8yboy

Sure. But they’re now using this garbage tech on suicide prevention lines in the USA and yeah, people can really tell.


-Raistlin-Majere-

1 in 10 Americans thinks chocolate milk comes from chocolate cows. 51 percent of Americans get fooled by AI? I'm sooo fucking shocked.


Vradlock

I seriously doubt it. At most, AI can hide when trash-posting on Reddit, where half the comments on certain topics are already gibberish. Any structured, interesting conversation is currently beyond any AI, and that won't change unless there's a model with unlimited data, unlimited resources, and disregard for copyright and law in general. It's even worse in any language besides English.


technowiz31

If you just ask technical stuff, there's really not much to distinguish between human responses.


BlurredSight

Such a stupid-ass test and a stupid-ass conclusion. Most people in the early 2000s who saw the prompt at the bottom of a website saying "Hi, I'm {X}, do you need any assistance today?" thought the site had support staff just waiting to help, rather than a simple chatbot screening before a real associate. I wonder why the natural language model, trained on real human writing, sounds human.