My Loebner Prize Contest 2009 Reflections

Posted by Mohan Embar on July 7, 2011

Introduction

It’s been over three years since I updated chipvivant.com, so these reflections are a couple of years old. Since Chip is a Loebner Prize 2011 finalist, I figured I’d better freshen up this site because I anticipate increased traffic. Better late than never, I guess.

When Hugh Loebner first posted the rules of the 2009 Loebner Prize Competition, I was excited because the contest was reverting to the format of previous years, which I had trained Chip for. There was the issue of Hugh’s arcane Loebner Prize Protocol, for which no one had yet graciously developed Open Source helper libraries, but the protocol seemed straightforward and didn’t worry me. For safety’s sake, I decided to invoke the option of bringing my own computer to Hugh’s premises even though Chip ran off of a USB pen drive.

Chip is not continuously online. Nor do I work on him full-time. I had done little development since the 2008 LPC and also didn’t have the benefit of tweaking Chip with numerous conversation logs. Nevertheless, I did have the benefit of the logs from the 2008 LPC (since Chip was running on my own web server) as well as the feedback I got from participating in the LPC 2008 prescreening.

I corrected the most glaring deficiencies, got the Loebner Prize Protocol (hereafter LPP) stuff working and made the appointment to meet Hugh at his place of business in New Jersey. Due to my inexperience with the LPP, I didn’t want to chance mailing a non-functional entry.

When I arrived at Hugh’s place of business, I learned that there were only two other entries that year and that I would automatically be part of the Final Four if the program functioned properly, which it did. Even though I got a free pass that year, I was beside myself with excitement because this was the culmination of a nearly decade-long dream. I was also very pleased with the screening questions, which were of the type I had expected.

The competition was to be held in Brighton, U.K. that year, which seemed both exciting and expensive. Because of my vegan activism, I have friends in countries all over the world, especially in the U.K., which is the birthplace of veganism. A vegan friend of mine graciously offered to put me up in her flat in London and I would commute by train to Brighton.

The nice thing about these competitions is that if your entry makes it to the finals, you have a few months after that to improve it. The bulk of my effort was done in the weeks before the competition, including a couple of all-nighters in London before the contest. I wanted to focus less on cramming facts into Chip and more on giving him a soul. My secret sauce would be the influence of Gary Shannon’s writings, particularly Programming a Chatbot to Understand a Sentence. His proposed implementation was too ambitious for my time constraints, but I wanted to capture the essence of it. This approach will also help when I try to use Chip to help and comfort people via empathynow.com (which is under construction).

The day of the competition finally came and I had literally stayed up the entire night before cramming new code into Chip (unwise, as any programmer would tell you).

The two other contestants were Rollo Carpenter and David Levy. (Why isn’t Levy’s LPC 2009 win mentioned in the Wikipedia article? And when do I get my own Wikipedia article?!) I had immense respect for both of them. Rollo’s Jabberwacky and Cleverbot both use novel approaches to generating responses, and David Levy was my equivalent of a Mega Rock Star. His articles in Creative Computing were the main reason I became interested in Artificial Intelligence. (Inspired by him, I wrote a chess program in Z-80 Assembler for the TRS-80 for a high school project.)

I met Hugh for the second time and also some nice new people: David Hamill, with whom I had conversed quite a bit on the Robitron list and who I felt shared my views, and Erwin van Lun, head of chatbots.org, another nice guy with whom I got to speak Dutch. There were some nice confederates (humans who took the other side of the bet and tried to convince the judge they were human) too. The only one whose name I can remember was Brian Christian, who was working on a book and magazine article at the time and interviewed me after the contest was over.

The actual competition was a bloodbath for Chip and me, though, and I chalk that up to lack of experience. Recall from My Loebner Prize Contest 2008 Reflections that my approach was to advertise Chip’s abilities to the judges. This was basically a gigantic dump of sample sentences that you can see starting at Motivations and Functionality (“My competitors might have cuter canned responses…”). I had used this approach during the 2008 competition and stand behind my reasoning for using it, but it proved fatal in the 2009 competition because of the Loebner Prize Protocol, which had me insert a 300 millisecond delay between each character. Combine that with a five-minute conversation limit and you get the horror I felt when Chip shamelessly steered a judge into asking “What can you do?” only to chew up the rest of the conversation attempting to vomit that gigantic list.

That happened for three of the four conversations. For the fourth one where that didn’t happen, Chip got the highest score.
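
To put rough numbers on the problem: at 300 milliseconds per character, a five-minute conversation has room for at most 1,000 characters of bot output, and that assumes the judge never types anything. The following is only a back-of-the-envelope sketch in Python. The function names are made up for illustration, the delay is treated as the only cost, and the 2,000-character capability list is a hypothetical figure, not the actual length of Chip’s list.

```python
# Back-of-the-envelope arithmetic only; this is not Chip's code or the LPP itself.
CHAR_DELAY_MS = 300                 # per-character delay mentioned above
CONVERSATION_LIMIT_MS = 5 * 60 * 1000  # five-minute conversation limit


def max_chars_in_conversation(limit_ms=CONVERSATION_LIMIT_MS, delay_ms=CHAR_DELAY_MS):
    """Upper bound on characters the bot can emit if it types the whole time."""
    return limit_ms // delay_ms


def seconds_to_send(text, delay_ms=CHAR_DELAY_MS):
    """Time needed to emit `text` one character at a time."""
    return len(text) * delay_ms / 1000


if __name__ == "__main__":
    print(max_chars_in_conversation())

    # Hypothetical 2,000-character capability list (an assumed length).
    capability_dump = "x" * 2000
    print(seconds_to_send(capability_dump))
```

Under those assumptions, a list of that size would need twice the entire conversation just to finish typing, which is why one “What can you do?” was enough to sink a round.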

On the one hand, it sucked to go all the way to England to have Chip give the performance he did. On the other hand, you have to stay in the game and not get discouraged by this kind of thing. Chip would have benefited if I had finished this earlier and exposed him to more real-life conversations, but I didn’t have time for that and I find the task of keeping Chip online and then poring through megabytes of useless logs tiresome (which is why Chip is offline at the moment). That said, I don’t regret going and was happy to meet the people I met – this is not something I can share with people in my immediate surroundings, however graciously they listen to me ramble on about it.

One final observation. Reading Brian Christian’s Atlantic article reminds me that as I write this in 2011, chatbot writers and the rest of the world are indeed living in parallel universes. When I read passages from his article like:

Turing’s prediction has not come to pass; however, at the 2008 contest, the top-scoring computer program missed that mark by just a single vote. When I read the news, I realized instantly that the 2009 test in Brighton could be the decisive one. I’d never attended the event, but I felt I had to go – and not just as a spectator, but as part of the human defense. A steely voice had risen up inside me, seemingly out of nowhere: Not on my watch. I determined to become a confederate.

…I scratch my head. Contrast this with my observation in Motivations and Functionality that basically only Chip and maybe one other bot in the world know that the moon is larger than an orange, and you understand my puzzlement as to why some people think that we’re anywhere close to machines that possess the kind of intelligence that the Turing Test would purportedly reveal. The reason that the 2008 entry was able to come so close is nicely explained here and has nothing to do with an entry that possessed any real intelligence. I can train anyone to spot the bot with fifteen minutes of training: just ask it simple questions like the ones listed in the Motivations and Functionality section. And Chip is not much better – as soon as you deviate slightly from the limited list of things he knows about, he’ll choke too.

That’s the stark reality of the state of affairs today. It may not make for sexy articles and news stories, but it’s the truth. (I’m not saying that Brian and others purposely try to sensationalize this, just that they might be a bit misguided.) That said, there’s plenty of progress to be made if we roll up our sleeves, are honest about things and confront these problems head on. Also, the fact that ELIZA was able to provide comfort to some people in the 60s shows that such a thing is possible despite the machine not possessing any real intelligence. That’s one of the things I hope to eventually accomplish with Chip.
