-1

First of all, sorry that the title does not explain the problem clearly. Let's begin.

I need to download this captcha image programmatically.

import urllib.request

import grab

root_url = 'https://e-okul.meb.gov.tr/'
g = grab.Grab()
g.go(root_url)
# Locate the captcha <img> element and build its absolute URL.
e = g.doc.select('//*[@id="image1"]')
captcha_url = root_url + e.attr('src')
# Fetch the image with urllib and save it to disk.
img = urllib.request.urlopen(captcha_url)
localFile = open('captcha.jpg', 'wb')
localFile.write(img.read())
localFile.close()

And the result is this.

When I download the image manually with the browser's "Save image as..." option, there is no problem.

Is there any way to download this captcha programmatically, which is what I actually need?

Zizouz212
  • I looked at both pictures and they seem fine. What is the problem exactly? – RobertB Nov 02 '15 at 22:34
  • Let me explain it more simply: I just need to download the captcha on this site [https://e-okul.meb.gov.tr](https://e-okul.meb.gov.tr) with Python. When I try to download the captcha with Python, I get an empty captcha like the one in the second link. If you look at the first link, that captcha doesn't have any numbers either. Please go to [https://e-okul.meb.gov.tr](https://e-okul.meb.gov.tr) first and check again; you'll see the difference. – Metehan Çelenk Nov 02 '15 at 22:52
  • Why do you *need* to download the captcha? – Meier Nov 02 '15 at 22:55

2 Answers

1

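The captcha image depends on a cookie: the server uses the cookie to decide which value appears on the image.

Use the same Grab object that loaded the home page to download the captcha image as well, so that the cookie is sent with the image request. urllib.request.urlopen makes a separate request without that cookie, which is why it returns an empty captcha.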

Try this:

import grab

root_url = 'https://e-okul.meb.gov.tr/'
g = grab.Grab()
# Loading the home page stores the session cookie in the Grab object.
g.go(root_url)
e = g.doc.select('//*[@id="image1"]')
captcha_url = root_url + e.attr('src')
# Reuse the same Grab object so the cookie is sent with the image request.
resp = g.go(captcha_url)
localFile = open('captcha.jpg', 'wb')
localFile.write(resp.body)
localFile.close()

It generated a file with the correct characters in it for me.
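
To see the cookie effect without touching the real site, here is a minimal standard-library sketch. Everything in it is an assumption made for illustration: the local DemoHandler server, the session=abc123 cookie, and the CAPTCHA-1234 / BLANK payloads are hypothetical stand-ins, not how e-okul actually works. The opener with a cookie jar plays the role of the shared Grab object.

```python
import http.cookiejar
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a site whose captcha depends on a cookie."""

    def do_GET(self):
        self.send_response(200)
        if self.path == "/":
            # The home page hands out the session cookie.
            body = b"home"
            self.send_header("Set-Cookie", "session=abc123; Path=/")
        else:
            # The "captcha" is only populated when the cookie comes back.
            has_cookie = "session=abc123" in self.headers.get("Cookie", "")
            body = b"CAPTCHA-1234" if has_cookie else b"BLANK"
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


# Start the demo server on a free local port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

# An empty ProxyHandler makes both openers talk to localhost directly.
no_proxy = urllib.request.ProxyHandler({})

# One opener with a cookie jar: cookies set by "/" are sent to "/captcha".
jar = http.cookiejar.CookieJar()
session = urllib.request.build_opener(
    no_proxy, urllib.request.HTTPCookieProcessor(jar)
)
session.open(base + "/").read()
with_cookie = session.open(base + "/captcha").read()

# A separate opener with no cookie jar, like urlopen in the question.
plain = urllib.request.build_opener(no_proxy)
without_cookie = plain.open(base + "/captcha").read()

server.shutdown()
server.server_close()
print(with_cookie, without_cookie)
```

Only the request made through the cookie-carrying opener should receive the populated payload; the cookie-less request gets the blank one, which mirrors the difference between the two downloads in the question.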

drew010
  • It worked. As you can guess, this is a school management system, and all the formal schools in Turkey use it. It is essentially a kind of CMS. I have a machine learning project that aims to increase students' success rate, so I needed some data. They (e-okul) don't provide any API or service, so I need to solve the captcha to scrape the data. Anyway, thanks a lot. – Metehan Çelenk Nov 02 '15 at 23:26
0

A more Pythonic way to write the file is a with statement:

import grab

root_url = 'https://e-okul.meb.gov.tr/'
g = grab.Grab()
g.go(root_url)
e = g.doc.select('//*[@id="image1"]')
captcha_url = root_url + e.attr('src')
resp = g.go(captcha_url)

# The with statement closes the file automatically, even if write() raises.
with open('captcha.jpg', 'wb') as localFile:
    localFile.write(resp.body)
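
As a small offline check of what the with statement does, the sketch below writes some bytes to a temporary file and confirms that the file is closed once the block ends. The image_bytes value is a hypothetical placeholder for resp.body, and the temporary directory is only there so the example does not touch the working directory.

```python
import os
import tempfile

# Hypothetical placeholder for resp.body from the snippet above.
image_bytes = b"\xff\xd8\xff\xe0 fake jpeg bytes"

path = os.path.join(tempfile.mkdtemp(), "captcha.jpg")

with open(path, "wb") as local_file:
    local_file.write(image_bytes)

# Leaving the with block has already closed the file; no close() call needed.
file_closed = local_file.closed

# Read the file back to confirm the bytes were written out in full.
with open(path, "rb") as saved:
    round_trip = saved.read()

print(file_closed, round_trip == image_bytes)
```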