Technology

Captcha if you can

A human-verification tool is now doing double duty to help digitize books.

It has become a daily occurrence for those of us on the Internet: deciphering the box of blurry words that shopping sites and forums often ask you to type out to prove that you’re a person, not a computer.

These “captchas” are filled out close to 200 million times per day around the world, a staggering number for such a tiny job.

So why not put all that effort toward some greater benefit? ReCaptcha, a project created at Carnegie Mellon University, takes the time people invest in typing out those blurry words and uses it to help digitize books and other print media.

Currently, when books, newspapers and other media are added to online archives, they’re scanned and turned into text using optical character recognition (OCR). Unfortunately, OCR makes a lot of mistakes. Under reCaptcha’s plan, words that OCR can’t make out are sent to reCaptcha, which then shows them to us as captchas, so that we identify the word and prove we’re human in one quick step. Since captchas reduce incoming website comment and e-mail spam, it’s a win-win for both business and digital archiving.
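For the technically curious, here is a minimal sketch in Python of how such a scheme could work. It is an illustration, not reCaptcha’s actual code. It assumes each captcha pairs a “control” word whose answer is already known with one word OCR couldn’t read, and that a reading is accepted once enough people agree on it. The names `UnknownWord`, `submit` and `VOTES_NEEDED`, and the threshold of three, are invented for this example.

```python
from collections import Counter

# Hypothetical threshold: how many matching human answers are required
# before a transcription is trusted. Not a figure from the article.
VOTES_NEEDED = 3


class UnknownWord:
    """A scanned word that OCR could not read, plus the human answers so far."""

    def __init__(self, image_id):
        self.image_id = image_id
        self.votes = Counter()

    def accepted_text(self):
        """Return the agreed transcription, or None if there is no consensus yet."""
        if not self.votes:
            return None
        text, count = self.votes.most_common(1)[0]
        return text if count >= VOTES_NEEDED else None


def normalize(answer):
    """Ignore case and surrounding spaces when comparing answers."""
    return answer.strip().lower()


def submit(control_answer, typed_control, unknown_word, typed_unknown):
    """Check one captcha attempt and return True if the user passes.

    The control word decides whether the user is treated as human.
    The answer for the unknown word is counted only when the control
    word was typed correctly.
    """
    passed = normalize(typed_control) == normalize(control_answer)
    if passed:
        unknown_word.votes[normalize(typed_unknown)] += 1
    return passed
```

In this sketch, a word is considered read only after three people who passed the control check type the same thing, so a single careless or malicious answer can’t slip a wrong word into the archive.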

Many large sites now use reCaptcha. Its current projects include helping to digitize old editions of The New York Times. Consider each captcha you solve a step closer to a fully digitized library.