Sunday, January 08, 2006

CAPTCHAing Artificial Intelligences

Apparently I am posting often enough to trigger Blogger's CAPTCHA system.

CAPTCHA stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart. These tests are becoming more and more important as hordes of automated web robots threaten to disrupt legitimate internet commerce. These robots are programs or scripts that interact with websites far faster than humanly possible. They can be used to sign up for millions of free email accounts as a first step in sending spam, to purchase all the tickets to a concert for scalping, and to flood web servers with requests in order to prevent access by others in a Denial of Service (DoS) attack. Many viruses and worms seek to launch these programs on infected machines in order to perform distributed attacks and make detection and tracking more difficult.

Many web sites such as TicketMaster and Blogger require anyone buying a ticket or starting a blog to enter a string of letters or numbers that appears in a distorted fashion in a picture. The text is usually fuzzy and covered by a pattern overlaying the picture. This makes image recognition by a computer program much more difficult, but not impossible. In fact, over the last several years the test has evolved as the analysis algorithms and tools have improved. Early versions of the test did not include the pattern overlay. As computer programmers become more sophisticated and build better tools, these types of tests will need to improve dramatically.

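
The bookkeeping behind such a test can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions, not how TicketMaster or Blogger actually do it: the names ALPHABET, new_challenge and check_response are made up for illustration, and the hard part, drawing the string as a fuzzy picture with a pattern laid over it, is left out entirely.

```python
import hmac
import secrets
import string

# Assumed alphabet: look-alike characters (0/O, 1/I/L) are left out so that
# a distorted rendering stays readable for people. This is an illustrative
# choice, not something any particular site is known to do.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "0O1IL"
)


def new_challenge(length=6):
    """Return a random string the visitor will be asked to type back."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_response(expected, typed):
    """Compare the typed answer to the challenge.

    Case and surrounding whitespace are ignored; the comparison itself is
    done in constant time so the check does not leak partial matches.
    """
    normalized = typed.strip().upper()
    return hmac.compare_digest(expected.encode(), normalized.encode())


if __name__ == "__main__":
    challenge = new_challenge()
    # A real site would render `challenge` as a distorted image here and
    # keep the plain string on the server side only.
    print("Challenge (would be shown as a picture):", challenge)
    print("Correct answer accepted:", check_response(challenge, challenge.lower()))
    print("Wrong answer accepted:", check_response(challenge, "not it"))
```

The random string is the easy half. Everything that makes the test hard for a program lives in the picture, which is why the distortion and the overlay are the parts that keep changing.
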
Alan Turing proposed a test that has evolved into what some people consider a holy grail of Artificial Intelligence. The Turing Test, as it is now known, was originally posed to see if a referee could distinguish a male from a female entirely by their written answers to his questions. The subjects are physically separated from the referee and only allowed to communicate through an anonymizing medium such as typed cards or electronic text. The referee can ask any question he sees fit to try to determine which subject is male and which is female on the basis of their answers. The ref knows that there is one male and one female, and he "wins" if he correctly determines which is which. The male "wins" if he can convince the ref that he is the female. The test can also be reversed such that the female is attempting to convince the ref that she is the male. In one actual test the male gave himself away by naming the sizes of pantyhose as Small, Medium and Large, instead of Queen.

The Turing Test has been applied to computers and Artificial Intelligence as well. Turing theorized that if a computer's answers could not be distinguished from a human's in such a test, the computer could be considered to be thinking. He predicted that by the year 2000 programs would exist that could pass the test. The Loebner Prize was established in 1990 with a $100,000 prize for the first program that can pass a version of the test. The full test has never been passed, but each year a $2,000 prize is awarded to the year's best contender.

As the state of the art in Artificial Intelligence (which has image recognition as a sub-discipline) improves, we can expect these public tests to get more extensive as well. Can you imagine having to explain a joke or draw a picture to order a concert ticket?

Find more info here:
CAPTCHA
Loebner Prize
Kitten CAPTCHA

I wonder if I could write a script to send these off to the Mechanical Turk?
