Breaking Weak CAPTCHA in 26 Lines of Code

February 23rd, 2010

During one of our recent engagements, we found a weak CAPTCHA implementation in the target Web application. The assessment was performed on-site, and after identifying this vulnerability we started talking with the CSO about how easy it would be to break it.

jxt9e4ya9ko0

The general consensus, of course, was “very easy”. The problem was that we could not find any good CAPTCHA-breaking software that the average Joe could download and run on his computer, so I spent a few minutes writing a simple Python script that returns the CAPTCHA solution for this particular implementation.

Before we dig into the script, let's analyze why this CAPTCHA is weak (it might not be obvious to some readers):

  1. The letters are not rotated
  2. All letters have the same height
  3. All letters have the exact same color
  4. The letters are not deformed in any way
  5. The background noise color is the same for the whole image
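
Points 3 and 5 are what make the attack so cheap: when every glyph uses one exact color, a single equality test separates the text from everything else. Here is a minimal sketch of that observation, assuming a tiny hand-made pixel grid. The `sample` data, the gray noise color, and the `color_histogram` helper are illustrative and not taken from the target application:

```python
from collections import Counter

BLACK = (0, 0, 0, 255)
GRAY = (128, 128, 128, 255)   # hypothetical noise color
WHITE = (255, 255, 255, 255)

def color_histogram(pixels):
    """Count how often each RGBA color occurs in a 2D grid of pixels."""
    return Counter(color for row in pixels for color in row)

# Hypothetical 3x4 image: black "ink", gray noise, white paper.
sample = [
    [WHITE, BLACK, GRAY, WHITE],
    [GRAY, BLACK, WHITE, WHITE],
    [WHITE, BLACK, GRAY, WHITE],
]

histogram = color_histogram(sample)
for color, count in histogram.most_common():
    print(color, count)
```

If a real CAPTCHA image shows only a handful of distinct colors like this, with the text always in the same one, filtering by color is all the "segmentation" an attacker needs.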

Now, let's see the code that breaks this CAPTCHA:

from PIL import Image

img = Image.open('input.gif')
img = img.convert("RGBA")

pixdata = img.load()

# Clean the background noise, if color != black, then set to white.
for y in xrange(img.size[1]):
    for x in xrange(img.size[0]):
        if pixdata[x, y] != (0, 0, 0, 255):
            pixdata[x, y] = (255, 255, 255, 255)

img.save("input-black.gif", "GIF")

# Make the image bigger (needed for OCR)
im_orig = Image.open('input-black.gif')
big = im_orig.resize((116, 56), Image.NEAREST)

ext = ".tif"
big.save("input-NEAREST" + ext)

# Perform OCR using pytesser library
from pytesser import *
image = Image.open('input-NEAREST.tif')
print image_to_string(image)
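
To make the two image-processing steps explicit without needing PIL installed, here is a minimal pure-Python sketch of the same idea on a 2D list of RGBA tuples. The helper names `clean_background` and `upscale_nearest` are hypothetical, and the upscaling is a simplified integer-factor nearest-neighbor, not PIL's actual resampling code:

```python
BLACK = (0, 0, 0, 255)
WHITE = (255, 255, 255, 255)

def clean_background(pixels, ink=BLACK, paper=WHITE):
    """Return a copy where every pixel that is not exactly `ink` becomes `paper`."""
    return [[p if p == ink else paper for p in row] for row in pixels]

def upscale_nearest(pixels, factor):
    """Enlarge the grid by an integer factor, repeating each source pixel."""
    out = []
    for row in pixels:
        # Repeat each pixel horizontally, then repeat the widened row vertically.
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

# Hypothetical 2x2 input: two black "ink" pixels and two noise pixels.
noisy = [
    [(90, 90, 90, 255), BLACK],
    [BLACK, (200, 10, 10, 255)],
]
cleaned = clean_background(noisy)
big = upscale_nearest(cleaned, 2)
for row in big:
    print("".join("#" if p == BLACK else "." for p in row))
```

Nearest-neighbor scaling copies existing pixels instead of blending them, so the enlarged image stays strictly black and white, which is presumably why the script passes `Image.NEAREST`.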

This simple script works with ~90% of the CAPTCHA images created using this specific implementation. Enjoy!
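
A success rate like this can be estimated by running the solver over a set of hand-labeled images. Here is a minimal sketch, assuming a hypothetical `solve` callable that takes a file name and returns the OCR text. The file names and labels below are made up for illustration and are not real results:

```python
def success_rate(solve, labeled_samples):
    """Fraction of samples for which solve(sample) matches the expected text."""
    if not labeled_samples:
        raise ValueError("need at least one labeled sample")
    # strip() because OCR output is assumed to carry trailing whitespace.
    hits = sum(1 for sample, expected in labeled_samples
               if solve(sample).strip() == expected)
    return hits / float(len(labeled_samples))

# Stand-in solver for illustration only: it looks answers up in a dict,
# with one deliberately wrong entry. A real run would call the OCR script.
fake_answers = {
    "a.gif": "x7k2\n",
    "b.gif": "q9p1\n",
    "c.gif": "mm3o\n",
    "d.gif": "zz81\n",
}
labels = [("a.gif", "x7k2"), ("b.gif", "q9p1"),
          ("c.gif", "mm30"), ("d.gif", "zz81")]

rate = success_rate(fake_answers.get, labels)
print("solved %.0f%% of %d samples" % (rate * 100, len(labels)))
```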

andres.riancho Uncategorized

  1. June 27th, 2010 at 00:53 | #1

    Hi,
    I got the error message “unsubscriptable object” on this line:

    if pixdata[x, y] != (0, 0, 0, 255)

    what must I do?

  2. August 6th, 2010 at 14:16 | #2

    You didn’t write pytesser in those 26 LoC

  3. October 12th, 2010 at 08:20 | #3

    @d9ping Who said he did?
    “# Perform OCR using pytesser library”

  4. November 3rd, 2010 at 05:49 | #4

    I got the error message “unsubscriptable object” on this line:

    if pixdata[x, y] != (0, 0, 0, 255)

    too

  5. November 17th, 2010 at 06:26 | #5

    Thank you. I will check it!
