Ocular Character Recognition

Ever since individual personal computers first came online in large numbers, they have been utilised as a huge opt-in distributed computing array by projects such as SETI@home and Folding@home.

But there are information-processing tasks that can be distributed yet are still impossible to perform with computers. The Stardust@home project uses the unparalleled image-recognition capabilities of the human brain to process data from an interplanetary sample collection mission. People all around the world take part in their spare time.

Auntie Beeb’s weekly program on the worldwide use of digital tech, Digital Planet, now reports on another innovative scheme to harness the eyes and noggins of computer users.

You know when a web site displays squiggly text against a blurry background and asks you to type in the characters to prove that you are not a spambot? That test is called a CAPTCHA, and the reCAPTCHA project at Carnegie Mellon University uses it to correct the character recognition of scanned old books.

Optical Character Recognition (OCR) is pretty good these days, partly because the algorithms now use dictionaries and word-frequency tables to improve their guesswork. But the technology is still far more error-prone than a human reader. And OCR software can apparently identify which words it’s having problems with. So the reCAPTCHA project does something very smart.

When you need to pass a spambot test at a participating web site, the project’s server feeds you two words from books it’s working on. One is a word it knows. One is a word it’s having trouble identifying. It doesn’t tell you which is which. You type in both words to gain access to the web site in question. The server thus collects a number of interpretations of each tricky word, and when a certain interpretation gets enough human “votes”, it is accepted as correct. Beautiful! People around the net take hundreds of thousands of these tests every day. Instead of asking people to devote spare time to the project, as Stardust@home does, the reCAPTCHA project uses brain time that would otherwise just go to waste.
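
For the programmers among you, here is a minimal Python sketch of how such a voting scheme might work. It is not the project's actual code. The names (submit, votes, accepted, VOTES_NEEDED) and the threshold of three votes are my own assumptions, since all I know is that "enough" votes are needed. I also assume that only the known word decides whether you pass, and that only people who pass get to vote on the tricky word.

```python
from collections import Counter, defaultdict

# Assumed threshold: the scheme only requires "enough" votes.
VOTES_NEEDED = 3

# Hypothetical in-memory stores, keyed by an ID for each tricky word.
votes = defaultdict(Counter)   # word ID -> count of each typed reading
accepted = {}                  # word ID -> reading accepted as correct


def normalise(text):
    """Ignore case and surrounding whitespace when comparing answers."""
    return text.strip().lower()


def submit(known_answer, unknown_id, typed_known, typed_unknown):
    """Record one test attempt and return True if the user passes."""
    # Assumption: the user is judged only on the word the server knows.
    if normalise(typed_known) != normalise(known_answer):
        return False

    # A passing user's reading of the tricky word counts as one vote.
    reading = normalise(typed_unknown)
    votes[unknown_id][reading] += 1

    # Accept the first reading that collects enough votes.
    if unknown_id not in accepted and votes[unknown_id][reading] >= VOTES_NEEDED:
        accepted[unknown_id] = reading
    return True
```

Calling submit("morning", "w1", "morning", "upon") three times would leave accepted["w1"] set to "upon", while an attempt that gets the known word wrong is rejected and its guess at the tricky word is thrown away.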

[More blog entries about tech, ocr, scanning, distributedcomputing]

Author: Martin R

Dr. Martin Rundkvist is a Swedish archaeologist, journal editor, skeptic, atheist, lefty liberal, bookworm, boardgamer, geocacher and father of two.

11 thoughts on “Ocular Character Recognition”

Bee says:

20 August, 2008 at 23:13

That’s one of the tidiest little bits of technology I’ve read about this week.

Niklas Ramsberg says:

21 August, 2008 at 01:57

Yeah, that’s incredibly neat. But kind of old news; they have been doing that for what, more than a year by now? I’m surprised no major tech news outlets have reported on it earlier. Or have they?

Martin R says:

21 August, 2008 at 02:00

No idea. Who reads major tech news outlets anyway? (-;

ArchAsa says:

21 August, 2008 at 03:23

OCR-scanning is the greatest invention since sliced bread. Now if they can just create smarter translating programs my life would be bliss.

Incredibly smart, this CAPTCHA project. I had heard something vague about it but didn’t realize it simply utilised this common security measure. I had started wondering why real words had started to appear instead of just random letters and thought it was meant to make it easier for users. Well, one can kill two birds with one stone, I guess.

I just hope they don’t get sued for salaries by greedy users.

Lars L says:

21 August, 2008 at 13:18

But what the heck does “nodick” mean in this context?

Luna_the_cat says:

29 August, 2008 at 14:52

…maybe it just means it expects the reader to be female? ;-p

Seriously, I did not know that the CAPTCHA tech was actually doing this. That is damn clever.

Prity says:

9 February, 2009 at 05:52

Dear sir,
I need the full source code for captcha image in asp.net with vb.net 2.0. Please help me.

Martin R says:

9 February, 2009 at 06:04

Sorry, I wouldn’t recognise source code in asp.net with vb.net 2.0 even if it was sung to me by a large boys’ choir.

Sakthi Gs says:

7 May, 2009 at 05:55

Hi Martin..
I am trying to recognize the alphabet in captcha using template matching ..but still i couldn’t find.
Could u plz tell me some method to identify the character

Martin R says:

7 May, 2009 at 08:49

I am quite a character myself, you know.

chokkamsivaprasad says:

16 October, 2009 at 02:23

dear sir,

we want this project pls send this project detail’s in my email id.
