...making Linux just a little more fun!

I'm utterly gobsmacked

Jimmy O'Regan [joregan at gmail.com]


Sun, 25 Jul 2010 00:34:57 +0100

This is, hands down, the single dumbest bug report I've ever seen: http://code.google.com/p/tesseract-ocr/issues/detail?id=337

I'm kind of reminded of the usability thread, because whenever I see a dumb question on the Tesseract list, or in the issue tracker, it's always a Windows user.

But mostly, I'm just wondering: can anybody think of a valid reason why anyone would want to OCR a CAPTCHA?

-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.




Jimmy O'Regan [joregan at gmail.com]


Sun, 25 Jul 2010 01:13:15 +0100

On 25 July 2010 00:47, David Richardson <dsrich at ieee.org> wrote:

> On 07/24/2010 07:34 PM, Jimmy O'Regan wrote:
>>
>> This is, hands down, the single dumbest bug report I've ever seen:
>> http://code.google.com/p/tesseract-ocr/issues/detail?id=337
>>
>> I'm kind of reminded of the usability thread, because whenever I see a
>> dumb question on the Tesseract list, or in the issue tracker, it's
>> always a Windows user.
>>
>> But mostly, I'm just wondering: can anybody think of a valid reason
>> why anyone would want to OCR a CAPTCHA?
>>
>>
>
> Can you say "spambot"?

I did say 'valid'... My answer was: That's a CAPTCHA, which BY DESIGN cannot be read by OCR.

You've betrayed such a complete misunderstanding of everything you've cared to mention - digits are numbers, not letters - that I wouldn't know where to begin, usually, but as I can't think of a single valid reason why anyone would even want to OCR a CAPTCHA, I guess I don't need to.

In fact, the only people I can think of who would want to OCR CAPTCHAs are spammers, which is what I'm going to consider you to be. Status: Invalid Labels: spam

-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.




Kapil Hari Paranjape [kapil at imsc.res.in]


Sun, 25 Jul 2010 08:14:25 +0530

Hello,

On Sun, 25 Jul 2010, Jimmy O'Regan wrote:

> as I can't think of a single valid reason why anyone would even
> want to OCR a CAPTCHA, I guess I don't need to.

OCR is one of the applications of computer vision.

The computer vision and artificial intelligence guys are extremely interested in figuring out whether CAPTCHAs are truly beyond the capability of "automated" cracking. In some cases CAPTCHAs have been cracked by programmes. For example,

http://www.cs.sfu.ca/~mori/research/gimpy/

Remember that CAPTCHA stands for:

Completely Automated Public Turing test to tell Computers and Humans Apart

So it is an attempt to have a computer generate a Turing Test which another computer cannot solve. If it is indeed possible to do this then AI would have failed!

Regards,

Kapil.




Dr. Parthasarathy S [drpartha at gmail.com]


Sun, 25 Jul 2010 08:32:34 +0530

My linusability project specifically excludes MS users and MS worshippers, for this very same reason. I am also deliberately avoiding users who want to dual boot Linux with MS-Win. Like trying to use a buffalo to propel a jet engine. In fact, I come across many more disgusting ones than the person who wants to OCR a CAPTCHA.

partha

On 25/07/2010, Jimmy O'Regan <joregan at gmail.com> wrote:

> This is, hands down, the single dumbest bug report I've ever seen:
> http://code.google.com/p/tesseract-ocr/issues/detail?id=337
>
> I'm kind of reminded of the usability thread, because whenever I see a
> dumb question on the Tesseract list, or in the issue tracker, it's
> always a Windows user.
>
> But mostly, I'm just wondering: can anybody think of a valid reason
> why anyone would want to OCR a CAPTCHA?
>
> --
> <Leftmost> jimregan, that's because deep inside you, you are evil.
> <Leftmost> Also not-so-deep inside you.
-- 
---------------------------------------------------------------------------------------------
Dr. S. Parthasarathy                    |   mailto:drpartha at gmail.com
Algologic Research & Solutions    |
78 Sancharpuri Colony                 |
Bowenpally  P.O                          |   Phone: + 91 - 40 - 2775 1650
Secunderabad 500 011 - INDIA     |
WWW-URL: http://algolog.tripod.com/nupartha.htm
---------------------------------------------------------------------------------------------




Ben Okopnik [ben at linuxgazette.net]


Sat, 24 Jul 2010 23:49:19 -0400

On Sun, Jul 25, 2010 at 12:34:57AM +0100, Jimmy O'Regan wrote:

> This is, hands down, the single dumbest bug report I've ever seen:
> http://code.google.com/p/tesseract-ocr/issues/detail?id=337
> 
> I'm kind of reminded of the usability thread, because whenever I see a
> dumb question on the Tesseract list, or in the issue tracker, it's
> always a Windows user.
> 
> But mostly, I'm just wondering: can anybody think of a valid reason
> why anyone would want to OCR a CAPTCHA?

You said it yourself: spammers. I think that's a 100% lock.

They were probably hoping that you, or whoever else responded to their "bug report", were stupid enough to show them how to make it work (which involves the assumption that it can be made to work.) Looking like an idiot against a chance of learning how to crack CAPTCHAs? Any spammer would take that bet, double-quick.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




Ben Okopnik [ben at linuxgazette.net]


Sat, 24 Jul 2010 23:54:07 -0400

On Sun, Jul 25, 2010 at 08:14:25AM +0530, Kapil Hari Paranjape wrote:

> 
> Remember that CAPTCHA stands for:
> 
>  Completely Automated Public Turing test to tell Computers and
>  Humans Apart
> 
> So it is an attempt to have a computer generate a Turing Test which
> another computer cannot solve. If it is indeed possible to do this
> then AI would have failed!

...at either one end or the other, inescapably.

[chuckle] "Can God make a rock so big that he himself can't lift it?"

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




Ben Okopnik [ben at linuxgazette.net]


Sun, 25 Jul 2010 00:02:40 -0400

On Sun, Jul 25, 2010 at 08:32:34AM +0530, Dr. Parthasarathy S wrote:

> My linusability project specifically excludes MS users and MS
> worshippers, for this very same reason. I am also deliberately
> avoiding users who want to dual boot Linux with MS-Win. Like trying to
> use a buffalo to propel a jet engine.

[laugh] You could do it; it would just be a very, very slow-moving jet engine. I suppose you could always use it as a vegetable cart...

It reminds me of the story of a marine railway in South Africa, not all that many years ago, where the shiny multi-million dollar carbon-fiber-and-Kevlar racing boats were hauled out by elephants.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




Jimmy O'Regan [joregan at gmail.com]


Sun, 25 Jul 2010 10:10:58 +0100

On 25 July 2010 03:44, Kapil Hari Paranjape <kapil at imsc.res.in> wrote:

> Hello,
>
> On Sun, 25 Jul 2010, Jimmy O'Regan wrote:
>> as I can't think of a single valid reason why anyone would even
>> want to OCR a CAPTCHA, I guess I don't need to.
>
> OCR is one of the applications of computer vision.
>

Yeah, but I think somebody engaged in legitimate computer vision research would be aware that you need more than a single sample to perform training, and, even if they weren't, would have paid enough attention to the training document to have read that.

-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.




Jimmy O'Regan [joregan at gmail.com]


Sun, 25 Jul 2010 10:47:10 +0100

On 25 July 2010 04:49, Ben Okopnik <ben at linuxgazette.net> wrote:

> On Sun, Jul 25, 2010 at 12:34:57AM +0100, Jimmy O'Regan wrote:
>> This is, hands down, the single dumbest bug report I've ever seen:
>> http://code.google.com/p/tesseract-ocr/issues/detail?id=337
>>
>> I'm kind of reminded of the usability thread, because whenever I see a
>> dumb question on the Tesseract list, or in the issue tracker, it's
>> always a Windows user.
>>
>> But mostly, I'm just wondering: can anybody think of a valid reason
>> why anyone would want to OCR a CAPTCHA?
>
> You said it yourself: spammers. I think that's a 100% lock.
>
> They were probably hoping that you, or whoever else responded to their
> "bug report", were stupid enough to show them how to make it work (which
> involves the assumption that it can be made to work.) Looking like an
> idiot against a chance of learning how to crack CAPTCHAs? Any spammer
> would take that bet, double-quick.

Mostly, I've been second-guessing my reaction (it has been brought to my attention quite a lot recently that I can be quite harsh in my responses). I asked on an IRC channel too; they all came to the same conclusion...

<jimregan> I don't want spammers coming to me for free help
<jimregan> well
<jimregan> one exception
<jimregan> assisted suicide

A lot of people use Tesseract for tasks for which it wasn't designed - most of the people who are vocal on the mailing list, in fact - including licence plate recognition, business card recognition, etc.

So it had been at the back of my mind that, at some stage, some spammer would happen along. What particularly disgusted me was that I had been prodding at the particular feature the spammer had requested earlier in the day.

-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.

