Monday, August 18, 2008

Captcha with a twist

At the Wikimedia Foundation we use captchas. They are a tool against spam bots and vandals. They do their job, but they can be annoying as hell when you do not get them right the first time. As they serve an important purpose, I am happy that we use them anyway.

The BBC News website reported on a captcha scheme that is different. The idea is so compelling that I want to seriously argue for the adoption of this scheme in the WMF.

The idea is that people recognise words that have been scanned but cannot be recognised by optical character recognition software. As we have projects in so many languages, we can provide an even better service: our people can do this same work for languages other than English. I am sure that when our Wikipedias become part of such an effort, it will help the digitisation effort for many languages.

The good news is that, according to the Recaptcha website, there is already a "plugin" for MediaWiki. So for me it is clear why we should join this great endeavour. It makes filling out captchas useful in more ways than one :) PS: why not have a place in the user preferences where you can recognise words, just because you can?
Thanks,
      GerardM

6 comments:

Anonymous said...

I completely agree.
I've heard about reCaptchas before and they're a really clever project. Leveraging the brainpower that is already being used in normal captchas to actually re-capture lost and poorly scanned books is a very worthy goal with a brilliant technical twist.

As you say, this fits right in to our mission of sharing knowledge.

(ironically, to post a comment here I have to fill out a traditional captcha...)

Milos Rancic said...

Anarchopedia has been using that (CMU) captcha for almost a year.

It even allows anonymous edits without a captcha and asks for one only if an external link is added.

BTW, we haven't had a single spam link since introducing the CMU captcha.
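
The "ask only when an external link is added" rule can be sketched roughly like this. This is just an illustration in Python, not Anarchopedia's or MediaWiki's actual code; the names `external_links` and `needs_captcha` and the simple URL pattern are assumptions made up for the example.

```python
import re

# Hypothetical, simplified pattern for external links (an assumption for
# this sketch; a real wiki would use its own link parser).
EXTERNAL_LINK = re.compile(r'https?://[^\s\]<>"]+', re.IGNORECASE)


def external_links(text):
    """Return the set of external URLs found in a piece of wiki text."""
    return set(EXTERNAL_LINK.findall(text))


def needs_captcha(old_text, new_text):
    """True only when the edit introduces a link not already on the page."""
    return bool(external_links(new_text) - external_links(old_text))
```

With this rule, an ordinary text edit goes through untouched, while an edit that adds a new URL is held until the captcha is solved.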

Anonymous said...

This was discussed a bit on the mailing list. According to Brion, they won't open source it for some reason (it's Carnegie Mellon University, so I don't see why not, but...).

Simetrical also pointed out that the captcha is reliant on their servers, which may not be able to deal with Wikipedia traffic. The developers would have to build in a backup, and a test for that backup, etc.

They should open source it... :)

Lise Broer said...

Excellent idea. I see the comment verification here doesn't use captcha, though. ;)

pfctdayelise said...

I'm sure I remember Brion saying that if someone creates an open source version, we'd use it.

Anonymous said...

Worst idea ever. reCaptcha is one of the worst CAPTCHA systems: it re-formats the page, and if the page includes special characters like ¿? or ¡!, reCaptcha makes it look horrible. I already tested this at editthisinfo and I completely oppose it. Cheers, Macy@enwiki.