www.guardian.co.uk: Alan Turing’s legacy: how close are we to ‘thinking’ machines?

I was interviewed for this article:

Alan Turing’s legacy: how close are we to ‘thinking’ machines?

Here’s the full text of the email interview I submitted for this article:

How long did you spend on Chip Vivant?
With the exception of a six-month period in late 2007 / early 2008, it’s been in fits and spurts. Probably a few weeks part time before each competition.

Is it true you “told your clients to leave you alone” to develop Chip?
It is true. That said, I only had one full-time client and one part-time client at the time. My full-time client wasn’t thrilled about this, and the scary part was that in the world of freelance consulting, it isn’t the case that “absence makes the heart grow fonder”. What’s more, this was at the beginning of 2008, a year whose latter part would bring the Great Recession. So looking back on it, taking the plunge cost me a lot, both in savings and in opportunity cost. Sometimes, though, you have to follow your inner voice and do what you have to do.

How did you feel about winning?
It felt wonderful, obviously. Although the first time I entered was in 2008, I had been dreaming about writing a chatbot since 2000 or so and actually had an interest in chatbots since my childhood. The win was all the more meaningful because I won on my own terms.

Like I say here:

http://www.chipvivant.com/2012/05/15/chip-vivant-wins-the-2012-loebner-prize-competition/

“…the thing that pleased me the most is that I won on my terms, without a fake backstory, fake typing errors and a preponderance of canned responses.”

As described further on that page, on the day of the win, I shared this message with my friends at Robitron (the Yahoo groups website where we discuss chatbots) and chatbots.org:

People – I’m still in shock with what just happened here. One of my main criticisms of this contest is that despite Hugh’s vision, the judges’ slavish interpretation of the contest rules favored fake backstories, canned responses and other trickery over real effort.

Well, Chip won:

– without spelling mistakes or fake backspaces to correct fake errors
– despite saying stuff like “I didn’t understand what you just said” and “I can’t deal with that syntactic variant yet – instead of ‘Jim likes peaches?’, use ‘Does Jim like peaches?’”
– despite his inability to say what his profession is (let alone, mother, father, brother, dog’s name, sister-in-law)

This flies in the face of a lot of long-cherished beliefs people have about this contest, including my own. I don’t know if it’s a fluke, but all of the judges were pretty consistent in how they approached this.

The above doesn’t mean that Chip doesn’t have his fair share of canned responses. I personally abhor canned responses, though. (I hired a contractor to write a bunch of them for me for this year’s contest because I couldn’t bring myself to do this myself.) I mention my reasons why here:

http://www.chipvivant.com/2008/09/05/my-loebner-prize-contest-2008-reflections/

Programming a bot to pretend to be a human involves much more than one line of code where the bot affirms that it’s human – it involves an extremely labor-intensive (and IMO time-wasting) effort to code up a web of lies which invariably implodes under its own weight.

The fact that Chip won means that despite the naysayers, the Loebner Prize Competition is a legitimate tool for attempting to advance the field of AI and not “an obnoxious and unproductive annual publicity campaign”, as Marvin Minsky put it.

One of the judges, whom I spoke with after the competition, summed up the reasons why this win pleased me so much, and did so better than I could. Her remarks, and the paradigm shift I effected in her, still have me floating on air:

Chip was, I think, the only Chatbot that really seemed to engage with me. ‘He’ apologised for not understanding a question. At one point Chip also suggested I might phrase a question differently so it would be more understandable to ‘him’. Chip didn’t try too hard pretending to be human but instead explained that it hoped to learn more so as to be able to answer my questions better in future. Chip made me realise that I really don’t care whether I’m talking to a human or a computer as long as the conversation is in some way rewarding or meaningful to me. Chip realised that conversations are a two-way street. Give and take. I don’t think any of the other finalists quite got that.

What’s next for Chip?
Winning the Loebner Prize without making a serious attempt to fool the judges has inspired me to continue working on Chip on my own terms. I want to enhance Chip to learn more about the person he’s talking to and use that to enrich the conversation. Whether Chip will perform as well in subsequent years as he did this year, and whether the judges will be receptive to Chip’s unwillingness to try to fool them, is another story, but no one can take the thrill of this year’s win away from me.

I also want to eventually integrate Chip into my commercial website, as I’ll discuss shortly.

Are you refining it so that it’ll one day actually fool judges?
I’ve become keenly aware of the futility of creating a program which comes anywhere close to fooling someone who knows what they’re doing. I’ve also become keenly aware of the overwhelming amount of unwritten, undocumented common knowledge which we’ll somehow need to codify if we’re ever to win the Loebner Prize or any other Turing Test. Case in point: during my development of Chip, I came to realize that there’s no non-human knowledge source anywhere that an extraterrestrial landing on Earth could use to find out whether an apple is smaller or larger than the moon. Try Googling for this answer. Try finding this answer in any book or library, anywhere. You’ll come up dry. And yet some laypeople think that a Turing-Test-winning program is just around the corner. The chasm between chatbot writers who are acutely aware of this and laypeople who aren’t is very large.

Most of the current chatbot entries in the Loebner Prize Competition attempt to trick their interrogator by using keyword-spotting tricks and canned responses. This can cause the machine to appear more intelligent than it really is. For example, if I say “I’m worried about my mother.”, a chatbot can simply spot the keyword “mother” in the sentence and spit out the canned response “Tell me more about your family.” Someone who doesn’t know how this is implemented could be fooled into thinking they’ve found a new, understanding soulmate, but if they dug deeper, they’d be sorely disappointed.
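To make the trick concrete, here is a minimal sketch in Python of keyword spotting with canned responses. It is an illustration only, not how any particular Loebner entry works: the table contents, the fallback line and the `reply` function are hypothetical, and only the “mother” example comes from the paragraph above.

```python
import re

# Hypothetical keyword table; only the "mother" entry comes from the text.
CANNED_RESPONSES = {
    "mother": "Tell me more about your family.",
    "father": "Tell me more about your family.",
}

# Hypothetical stock line used when no keyword is spotted.
FALLBACK = "Please go on."


def reply(utterance: str) -> str:
    """Return the canned response for the first known keyword.

    Grammar and meaning are ignored entirely: the sentence is only
    scanned for words that appear in the table.
    """
    for word in re.findall(r"[a-z']+", utterance.lower()):
        if word in CANNED_RESPONSES:
            return CANNED_RESPONSES[word]
    return FALLBACK
```

Calling `reply("I'm worried about my mother.")` returns the family prompt, but so does any other sentence that happens to contain “mother”, whatever it means. A question with no keyword in the table, such as “Where on your face is your nose?”, falls through to the stock fallback line.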

If someone tells me that with an arsenal of a gazillion such tricks, a computer succeeds in passing the Turing Test, then that revelation will be about as interesting to me as telling me that I was able to fool a roomful of three-year-olds into believing that I could really pull coins out of their ears. Why? Because in five minutes, I can teach anyone to bust the bot. It’s really simple: ask any chatbot a question that it is unlikely to have a canned response for, yet which most humans would answer the same way. Here’s an example: “Where on your face is your nose?” Most humans, regardless of race, language, culture, gender, age, etc. will answer this question in exactly the same way, yet no chatbot will answer this unless it has a canned response for this particular question in its database. (It could maybe try looking this up in Wikipedia, but the answer is unlikely to be the three-word answer that you and I both have in our heads right now.)

So I can say categorically that a chatbot which fools a savvy interrogator will not happen in my lifetime, and I estimate I’ve got a good 35-40 years left if I don’t get hit by a bus. The fact that four years after I revealed this, you still can’t Google for or find codified knowledge for things like whether an apple is larger or smaller than the moon means that we’ve got light years to go before we crack this code.

That said, I’m equally astonished by the unexploited low-hanging-fruit-type opportunities which abound in this field. In the 1960s, the ELIZA program with its simple keyword-spotting parlor tricks managed to dupe people and give them comfort. Imagine how much more we could accomplish along these lines with the technology at our disposal today.

Might it have a commercial future?
My website empathynow.com, just recently launched, is a baby step in the direction of my commercial foray into applying my “skill” or “talent”, or whatever you want to call it, to help and soothe people. Like I said, I’m not interested in creating a chatbot that fools people, but rather one that can empathize with and provide comfort to people who can’t or don’t want to get it from a real person. Chip Vivant is not integrated into empathynow.com yet, but will be eventually.

Do you think the Turing Test is still relevant today?
I honestly don’t think the Turing Test was ever relevant, partly because I believe the deck is unfairly stacked against the machines. Why? Because we all have sensory inputs which are much different than those of a computer, so if I ask a computer to describe a hunger pang, or how something slimy feels when you swallow it, then the programmers would have to spend a ton of time modeling these things that you and I get “for free”.

Here’s another example – answer the following question for me:

A cricket bat and ball cost one pound and ten pence total. The bat costs one pound more than the ball. How much does the ball cost? (Stop here and read ahead only when you have the answer in your head.)

What’s your answer? If you’re like most people, your answer is probably that the bat costs one pound and the ball costs ten pence. Wrong. (Now go do the math.) So now how in the world would I go about modeling why most people get this answer wrong? And why would I possibly want to waste my time doing this? I could name a gazillion similar examples.
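For readers who want to check the math, here is a minimal sketch in Python. The variable names are illustrative, and amounts are kept in whole pence to avoid rounding issues.

```python
# Work in whole pence to avoid floating-point rounding.
TOTAL = 110       # bat + ball, in pence
DIFFERENCE = 100  # bat - ball, in pence

# ball + (ball + DIFFERENCE) = TOTAL  =>  ball = (TOTAL - DIFFERENCE) / 2
ball = (TOTAL - DIFFERENCE) // 2
bat = ball + DIFFERENCE

# The intuitive answer (ball = 10 pence) makes the bat 110 pence,
# so the pair would cost more than the stated total.
intuitive_ball = 10
intuitive_total = intuitive_ball + (intuitive_ball + DIFFERENCE)

print(f"ball = {ball}p, bat = {bat}p, total = {ball + bat}p")
print(f"intuitive answer gives a total of {intuitive_total}p")
```

The ball costs five pence and the bat one pound and five pence. The ten-pence answer would put the total at one pound and twenty pence.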

That’s why I don’t think the Turing Test is the best measure of machine intelligence: you can ascribe intelligence to things that don’t involve the huge burden of understanding or mimicking humanness. When I say that the Turing Test was never relevant, I’m saying that the information I’m basing my conclusion on was just as readily available in the 1950s when the Turing Test was formulated.

That said, I staunchly defend Hugh Loebner and the Loebner Prize Competition. As Hugh says here:

http://loebner.net/Prizef/In-response.html

At the current state of the art I suggest that the appropriate orientation for the contest is to determine which of obviously artificial computer entries is the best entry, i.e. most human like, and nominate the authors as “winners.” It should not be to determine if a particular terminal is controlled by a human or a computer. If we maintain this orientation, there should be no problems holding unrestricted tests.

Not all of the judges “get” this, which can be tiresome if they expect to get witty, humanlike behavior, but I still think this contest is one of the best contests of its kind and commend Hugh for persisting with this despite the naysayers.

Why did you decide on the Loebner Prize as your project?
The Loebner Prize gave me a concrete goal and deadline, despite my lack of interest in creating a chatbot which pretends to be human. As mentioned previously, in the early 2000s, when reading transcripts of previous years’ competitions, my mouth started to water and I knew I wanted to be a part of this, on my own terms.
