Using captchas to digitize old, damaged books
- August 18th, 2008
- 4 Comments
Those dumb captchas may actually serve a purpose now. Luis von Ahn at Carnegie Mellon University (Biggs went there!) has devised a system whereby words that scanning software can’t read are sent to captcha providers. Since the software is unable to discern what the words are, the captchas step in, and we humans get to identify them when logging into our favorite sites. As people identify the words in the captchas, their answers are sent back the other way, thus completing the digitization of the text in question.
Using this method, people were able to help complete the archive of the 1908 New York Times. Pretty neat, and nice to know that my freaking out at entering the wrong word over and over again may actually serve some greater purpose.
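For the curious, here is a rough sketch of how a system like this could work. It is not von Ahn's actual code. It assumes each captcha pairs a control word the server already knows with one unreadable scanned word, and that a reading is accepted once enough people agree on it. The class name `WordDigitizer`, the `votes_needed` threshold, and the word IDs are all made up for illustration.

```python
from collections import Counter


class WordDigitizer:
    """Toy model: pair a known control word with an unknown scanned word."""

    def __init__(self, votes_needed=3):
        self.votes_needed = votes_needed  # hypothetical agreement threshold
        self.votes = {}                   # unknown word id -> Counter of guesses
        self.resolved = {}                # unknown word id -> accepted reading

    def submit(self, control_answer, control_guess, unknown_id, unknown_guess):
        """Return True if the user passes the captcha."""
        # Only the control word decides pass or fail.
        if control_guess.strip().lower() != control_answer.lower():
            return False

        # A passing user's reading of the unknown word counts as one vote.
        guess = unknown_guess.strip().lower()
        counter = self.votes.setdefault(unknown_id, Counter())
        counter[guess] += 1

        # Accept the most common reading once enough users agree on it.
        top, count = counter.most_common(1)[0]
        if count >= self.votes_needed:
            self.resolved[unknown_id] = top
        return True


digitizer = WordDigitizer(votes_needed=2)
digitizer.submit("morning", "morning", "w1", "Gazette")  # passes, one vote
digitizer.submit("morning", "mornin", "w1", "xxxx")      # fails, no vote
digitizer.submit("morning", "morning", "w1", "gazette")  # passes, second vote
print(digitizer.resolved)
```

The point of the sketch is that the server never needs to know the unreadable word up front: it grades you on the control word only and treats your other answer as a vote.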

Brian Fitzpatrick
3 months ago
Between this and Amazon’s Mechanical Turk project, it seems that mankind is providing more processing power to the Internet every day. :)
plceli
3 months ago
I don’t get it. How is the system supposed to know that you entered the right thing if the system doesn’t even know what it is supposed to be?
Of course, as multiple people enter the same thing, a common response emerges, but that’s not usable in a captcha context…
Chuck
3 months ago
Yeah, seems like it’s an annoyance generator. The server can’t know that you have entered the right letters, so it just stores your result, flags it as wrong and has you try again with a word it does know. We are indeed slaves to the machine.
Jarett
3 months ago
It shows one word it knows and one it doesn’t. If you get the known word wrong, you fail the check.
Oh, and this has been around for several months now. I expect the BBC News is behind the times, but I’m surprised that a bunch of tech bloggers haven’t seen this yet.