What's wrong with the logic of this captcha?

Question

First, sorry that the title does not describe the problem clearly.

I need to download this captcha image programmatically.

import grab, urllib.request

root_url = 'https://e-okul.meb.gov.tr/'
g = grab.Grab()
g.go(root_url)
e = g.doc.select('//*[@id="image1"]')
captcha_url = root_url + e.attr('src')
img = urllib.request.urlopen(captcha_url)
localFile = open('captcha.jpg', 'wb')
localFile.write(img.read())
localFile.close()

And the result is this.

When I download the image manually with the browser's "Save image as..." option, there is no problem.

Is there any way to download this captcha programmatically, which is what I actually need?

I looked at both pictures and they seem fine. What is the problem exactly? — RobertB, Nov 02 '15 at 22:34
Let me explain it more simply: I just need to download the captcha on this site, [https://e-okul.meb.gov.tr](https://e-okul.meb.gov.tr), with Python. When I try to download it with Python, I only get an empty captcha, like the one in the second link. I think the captcha in the first link has no numbers in it either. Please go to [https://e-okul.meb.gov.tr](https://e-okul.meb.gov.tr) first and check again; you'll see the difference. – Metehan Çelenk, Nov 02 '15 at 22:52

score 1 · Accepted Answer · answered Nov 02 '15 at 22:48


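The captcha image depends on a cookie to populate the value that appears on the image. The urllib.request.urlopen call in your code is a separate request that does not send the cookie Grab received when it loaded the homepage, which would explain the empty image.

You should use the same Grab object you loaded the homepage with to also download the captcha image.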

Try this:

import grab

root_url = 'https://e-okul.meb.gov.tr/'
g = grab.Grab()
# Loading the homepage stores the session cookie on the Grab object
g.go(root_url)
e = g.doc.select('//*[@id="image1"]')
captcha_url = root_url + e.attr('src')
# Reuse the same Grab object so the cookie is sent with the image request
resp = g.go(captcha_url)
localFile = open('captcha.jpg', 'wb')
localFile.write(resp.body)
localFile.close()

It generated a file with the correct characters in it for me.
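
The same idea can be sketched without Grab. The version below is only an illustration using the Python standard library: one opener with a cookie jar loads the page and then the image, so any cookie set by the first request is sent with the second. The names ImgSrcFinder and fetch_captcha are made up for this example, and it assumes the captcha is an img element with id image1, as in the XPath above. It uses urljoin instead of string concatenation so that both relative and absolute src values resolve.

```python
import http.cookiejar
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcFinder(HTMLParser):
    """Hypothetical helper: records the src of the first <img> with a given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.src = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("id") == self.element_id and self.src is None:
            self.src = attrs.get("src")


def fetch_captcha(root_url, element_id="image1"):
    """Load the page, then fetch the captcha image through the same cookie jar."""
    jar = http.cookiejar.CookieJar()
    # Every request made through this opener shares the cookies in `jar`.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

    # First request: the server sets its cookie here.
    with opener.open(root_url) as page:
        html = page.read().decode("utf-8", errors="replace")

    finder = ImgSrcFinder(element_id)
    finder.feed(html)
    if finder.src is None:
        raise ValueError(f"no <img id={element_id!r}> found")

    # Second request: the cookie from the first response is sent along.
    with opener.open(urljoin(root_url, finder.src)) as img:
        return img.read()
```

Calling fetch_captcha('https://e-okul.meb.gov.tr/') and writing the returned bytes to captcha.jpg in 'wb' mode would mirror the Grab code above. This sketch has not been checked against the live site.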

answered Nov 02 '15 at 22:48

drew010

It worked. As you can guess, this is a school management system, and all the formal schools in Turkey use it. It is really just a kind of CMS. I have a machine learning project that aims to increase students' success rate, so I needed some data. They (e-okul) don't provide any API or service, so I need to solve the captcha to scrape the data. Anyway, thanks a lot. – Metehan Çelenk Nov 02 '15 at 23:26

score 0 · Answer 2 · answered Dec 30 '15 at 01:22

A more Pythonic way to write the file is a with block, which closes the file automatically:

import grab, requests, urllib

root_url = 'https://e-okul.meb.gov.tr/'
g = grab.Grab()
g.go(root_url)
e = g.doc.select('//*[@id="image1"]')
captcha_url = root_url + e.attr('src')
resp = g.go(captcha_url)

with open('captcha.jpg', 'wb') as localFile:
    localFile.write(resp.body)
