Fail to submit webform with urlopen

Question

I'm a total newbie at scraping, but I have started a small project using Python 3.4. For some reason the following code does not submit the form properly. In my first attempt I basically only want to hit "search" ("Sök") on a web form.

The code I have used is:

import urllib.parse
import urllib.request

url = 'http://www.kkv.se/Diariet/default.asp?nav=2'
values = {'action': 'S%F6k',
          'dossnr_from': '0',
          'dossnr_tom': '0',
          'hits_page': '10',
          'hits_search': '50',
          'sort': 'Regdatum',
          'sortorder': 'Fallande'}

data = urllib.parse.urlencode(values)
print(values)
data = data.encode('utf-8') 
req = urllib.request.Request(url, data)
response = urllib.request.urlopen(req)
the_page = response.read()
print(the_page)

I also tried submitting the POST data (which I found in Firebug after submitting the form manually):

url_values = ('diarienr=&diaryyear=&text_arendemening=&text_avsandare=&regdatum_from=&'
              'regdatum_tom=&beslutsdatum_from=&beslutsdatum_tom=&dossnr_from=0&dossnr_tom=0&'
              'hits_page=10&sort=Regdatum&hits_search=50&sortorder=Fallande&action=S%F6k')

url = 'http://www.kkv.se/Diariet/default.asp?nav=2'
full_url = url + '?' + url_values
data = urllib.request.urlopen(full_url)
print(data.read())

But both snippets only return the source of the starting URL. Can anyone please point me in the right direction?

Thank you very much for your help. Equilib

score 0 · Answer 1 · answered May 01 '14 at 15:30

You should remove the ?nav=2 from the URL you're posting to.
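
A minimal sketch of that change, which builds the request without sending it. It also flags one thing not mentioned above: urlencode percent-encodes the values it is given, so the already-encoded 'S%F6k' would be escaped a second time. The sketch passes the raw text instead, on the assumption that the form expects Latin-1 (which would match the %F6 seen in Firebug). The name post_url is only illustrative, and whether the server accepts this request is untested.

```python
import urllib.parse
import urllib.request

# POST target with the '?nav=2' query string removed, as suggested above.
post_url = 'http://www.kkv.se/Diariet/default.asp'

values = {
    'action': 'Sök',  # raw text; urlencode does the percent-encoding
    'dossnr_from': '0',
    'dossnr_tom': '0',
    'hits_page': '10',
    'hits_search': '50',
    'sort': 'Regdatum',
    'sortorder': 'Fallande',
}

# Assumption: the form expects Latin-1, where 'ö' is the single byte 0xF6.
body = urllib.parse.urlencode(values, encoding='latin-1').encode('ascii')

# Supplying data makes urllib issue a POST; nothing is sent until urlopen(req).
req = urllib.request.Request(post_url, data=body)
print(req.get_method())
print(body.decode('ascii'))

# For comparison: an already-encoded value gets its '%' escaped again.
double_encoded = urllib.parse.urlencode({'action': 'S%F6k'})
print(double_encoded)
```

Printing the body before calling urlopen makes it easy to compare against what Firebug shows for the manual submission.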

Daniel Roseman

If I do so I'm no longer directed to the webform in question. – Equilib May 01 '14 at 15:37
Any ideas on how I can proceed? The webform is only accessible when "?nav=2" is added. – Equilib May 02 '14 at 23:32

score 0 · Answer 2 · answered May 01 '14 at 15:35

Notice that in your second attempt the URL already includes a '?' and the query string starts with nav=2:

url = 'http://www.kkv.se/Diariet/default.asp?nav=2'

You then construct a full URL and add a redundant '?' after the base URL. That '?' should be an '&', because the base URL already contains a query string (nav=2), so further parameters have to be appended to it rather than starting a new one.
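
As a rough sketch of the point above, a small helper can pick the separator by checking whether the base URL already carries a query string. The helper name append_query is hypothetical, and the query string is shortened here for readability.

```python
from urllib.parse import urlsplit


def append_query(base_url, query_string):
    """Join query_string onto base_url, using '&' if a query already exists."""
    separator = '&' if urlsplit(base_url).query else '?'
    return base_url + separator + query_string


url = 'http://www.kkv.se/Diariet/default.asp?nav=2'
# Shortened version of the Firebug query string, already percent-encoded.
url_values = 'dossnr_from=0&dossnr_tom=0&hits_page=10&action=S%F6k'

full_url = append_query(url, url_values)
print(full_url)
```

Note that this only fixes how the URL is put together. Opening full_url with urlopen still sends a GET request, which may not be what the form expects if the browser submits it as a POST.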

Ryan

Thanks for the idea, Ryan. Unfortunately, changing the second "?" to "&" changes nothing. I'm still stuck on the first page. – Equilib May 02 '14 at 23:30
