So…yet another post that has absolutely nothing to do with being lazy! Lately, I have been pondering my humanity. ‘What do you mean?’ you might ask. Well, when registering or signing up for something on the internet, you are often faced with this:
The good old test to determine whether you are human or spam mail. It might just be me, but I have found that such words are getting increasingly difficult to identify – even for a human being, such as myself.
Upon failing the ‘test’ for the third time, I vented my frustration on Facebook. To my delight, this has become quite a source of amusement… I learnt that (quote, unquote from a friend):
Apparently, in determining whether or not you are human, you are also helping to convert great works of literature into a digital format. Initially, when converting documents, an OCR (optical character recognition) application converts all that it can; anything left over is left to humans to convert via CAPTCHA codes.
A valid question from my other friend:
I just read this post, and I have a rookie-level question: if our answers for the codes are being used to convert words that OCR can’t interpret into a digital format, how do the websites that want to determine that I’m not a robot know whether or not I am entering the correct answer?
Good question! And the response from my learned friend:
The word on the left is a control word, which both OCR programs agree on. The word on the right is a suspicious word, which the OCR programs disagree on, and which they can’t find in a dictionary. If a person identifies the control word successfully the software assumes that they have also identified the suspicious word correctly, and once a few people have done so that becomes a confirmed word too.
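For the curious, here is how I picture that logic working, as a rough Python sketch. It is only my reading of my friend's explanation, not the real service's code: the class name `WordPairChallenge`, the `submit` method, and the threshold of three matching answers (my guess at what "a few people" means) are all made up for illustration.

```python
from collections import Counter

# Assumption: "a few people" is taken to mean three matching answers.
CONFIRMATIONS_NEEDED = 3


class WordPairChallenge:
    """Hypothetical model of one control word paired with one suspicious word."""

    def __init__(self, control_answer):
        self.control_answer = control_answer.lower()
        self.votes = Counter()   # tallies of guesses for the suspicious word
        self.confirmed = None    # set once enough people agree

    def submit(self, control_guess, suspicious_guess):
        """Return True if the user passes the 'are you human?' test."""
        # Only the control word can actually be checked.
        if control_guess.strip().lower() != self.control_answer:
            return False
        # The control word was right, so trust the other guess and count it.
        if self.confirmed is None:
            guess = suspicious_guess.strip().lower()
            self.votes[guess] += 1
            if self.votes[guess] >= CONFIRMATIONS_NEEDED:
                self.confirmed = guess
        return True


challenge = WordPairChallenge("morning")
challenge.submit("morning", "overlooks")
challenge.submit("mourning", "overlocks")  # control word wrong: guess ignored
challenge.submit("Morning", "overlooks")
challenge.submit("morning", "Overlooks")
print(challenge.confirmed)
```

So the website never needs to know the answer to the suspicious word. It only checks the control word and lets the crowd settle the rest.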
Who said Facebook was a waste of time? It’s a pure gem of an educational resource!
If you want to read more on the topic (and also photo credit for the above image): Wikipedia goodness right here.