ChatGPT has not won the Turing test
I have watched the development of ChatGPT with more than a little fascination. What’s particularly irksome to me is the apparently widespread notion that ChatGPT has won the Turing test. Typically, that notion arises because some unsuspecting person was caught believing that a piece of text written by ChatGPT was in fact written by a human. Perhaps I am being a stickler for the rules, but the test, as Turing envisioned it, was that a human judge could ask both a (hidden) computer and a (hidden) human any question under the sun, and after some time the judge would proclaim one or the other the “human.” The computer would win if the judge mistakenly declared it to be human. By these rules, ChatGPT is nowhere close to winning the Turing test. A hypothetical judge can easily formulate questions that cause ChatGPT to go haywire; simply asking it to count the number of words in an unusual sentence will do. And I think I am doing more than picking nits - this question is of great importance.
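For the curious, here is a minimal sketch in Python of what such a word-counting probe might look like. Everything in it is illustrative: `ask_model` is a hypothetical stand-in for whatever chat interface or API the judge happens to be using, and counting whitespace-separated tokens is just one convention for “number of words.” The point is only that the judge can compute the ground truth locally and compare it against the model’s answer.

```python
import re

def true_word_count(sentence: str) -> int:
    # One reasonable convention: count whitespace-separated tokens.
    return len(sentence.split())

def extract_claimed_count(reply: str) -> int | None:
    # Pull the first integer out of the model's reply, if any.
    match = re.search(r"\d+", reply)
    return int(match.group()) if match else None

def probe(sentence: str, ask_model) -> bool:
    # ask_model is a hypothetical stand-in for whatever chat
    # interface or API the judge is using.
    reply = ask_model(f'How many words are in this sentence? "{sentence}"')
    claimed = extract_claimed_count(reply)
    return claimed == true_word_count(sentence)

if __name__ == "__main__":
    # An "unusual" sentence: stray punctuation and contractions make
    # tokenizer-based models likelier to miscount than a human judge.
    sentence = "Colorless green ideas, all seventeen of them, sleep furiously, don't they?"
    fake_model = lambda prompt: "There are 12 words."  # stand-in for an LLM reply
    result = probe(sentence, fake_model)
    print("model passed the probe" if result else "model failed the probe")
```

A single probe like this proves little on its own, of course; Turing’s setup assumes an extended, adversarial interrogation, and the word count is just one of many questions a motivated judge could ask.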
The difference between winning and losing the Turing test matters because winning confers a kind of humanity on the computer. Losing leaves the computer as just another tool, albeit a very versatile one. That is the whole ball game, really: are these things tools, or are they in some way fellow sentient beings?
The big danger with AI is not so much that it will turn all of our toasters into brain-eating zombies, or something of the sort. Rather, the danger is that we will think it capable of such acts, that we will mistake it for an autonomous entity, and that we will inadvertently grant it rights well outside its due.
My view is that AI is a useful tool, but ultimately just a tool, marshaled by people and organizations like any other. The main difference between AI and other tools is that it offers a layer of plausible deniability, a veneer of misdirection, far more than other tools do. If a carpenter wields a hammer but bangs someone’s thumb rather than a nail - well, the carpenter is held accountable for the damage done, not the hammer. The same accountability should attach to the people and organizations using AI, and especially LLMs, for one purpose or another.
A regulatory regime built on that principle of accountability would, I think, guide the usage and development of ChatGPT and its fellow-traveler generative AIs in a useful direction. It would also help to clarify just where the danger lies in trusting these things too much, and help to slow the pace at which people deploy LLMs carelessly. Ultimately, that would produce far more value for us as a species, and would curb the ways in which generative AI can be extractive or harmful.
Update: This article in Nature has a somewhat click-baity headline, but it’s a reasonably thorough evaluation of the LLM / Turing test question. The AI21 Labs experiment mentioned in the article strikes me as too generous to the computer - the judges were probably not very skilled, and the examination time was short - but it’s an interesting finding all the same.