Old text used to baffle scammers

News by Dan Raywood

Websites are proving more resistant to scammers thanks to a new text-based system.

Named Recaptcha, after Captcha (Completely Automated Public Turing test to tell Computers and Humans Apart), the system uses words taken from old books and documents rather than the typical obscured words and characters. The words supplied are those that character reading software cannot read but humans can, so each answer helps to complete the conversion of old texts to digital form.


The old text is first scanned by character reading software, and the words it fails to decipher are ones that humans can read but computers cannot. The concept is now being used by websites to stop junk mailers and address harvesters, and it is estimated that it is used around 100 million times a day.


Captchas were created by Luis von Ahn at Carnegie Mellon University in Pittsburgh. In some documents, where the ink has faded and the paper has yellowed, the character reading software can flag up to 20 per cent of words as indecipherable. These words are then sent out to the sites that have signed up to be Recaptcha partners, each supplied along with a control word to ensure that the person answering is human.
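The pairing of an unknown word with a control word can be illustrated with a short Python sketch. This is not Recaptcha's actual code: the function names, the vote counter and the three-vote threshold are all assumptions made for illustration. The idea is that an answer for the unknown word only counts if the same person typed the control word correctly.

```python
from collections import Counter


def normalize(word):
    """Ignore case and surrounding whitespace when comparing answers."""
    return word.strip().lower()


def check_response(control_truth, control_answer, unknown_answer, votes):
    """Return True if the control word was typed correctly.

    Only then is the guess for the unknown word recorded in `votes`
    (a Counter, hypothetical storage for this sketch).
    """
    if normalize(control_answer) != normalize(control_truth):
        return False  # control word wrong: treat the answer as untrusted
    votes[normalize(unknown_answer)] += 1
    return True


def resolved_word(votes, min_votes=3):
    """Return the most common guess once it has at least `min_votes` votes.

    The threshold of 3 is an assumed value, not taken from the article.
    """
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


if __name__ == "__main__":
    votes = Counter()
    check_response("morning", "morning", "upon", votes)
    check_response("morning", "mourning", "xyz", votes)  # rejected
    print(votes)
```

In this sketch a wrong control word means the guess for the faded word is discarded, so automated guesses do not pollute the transcription.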


According to the BBC website, in the past year Recaptcha has helped resolve more than 440 million words and has just helped to complete the conversion of the entire archive of the New York Times from 1908 into digital form.
