I'm a total newbie at scraping, but I have started a small project using Python 3.4. For some reason the following code does not submit properly. In my first attempt I basically only want to hit "search" ("Sök") on a web form.

The code I have used is:

import urllib.parse
import urllib.request

url = 'http://www.kkv.se/Diariet/default.asp?nav=2'
values = {  'action' : 'S%F6k',
        'dossnr_from' : '0',
        'dossnr_tom' :  '0',
        'hits_page' :   '10',
        'hits_search' : '50',
        'sort' :  'Regdatum',
        'sortorder' : 'Fallande'}

data = urllib.parse.urlencode(values)
print(values)
data = data.encode('utf-8') 
req = urllib.request.Request(url, data)
response = urllib.request.urlopen(req)
the_page = response.read()
print(the_page)

I also tried submitting the post results (that I find in Firebug after manually posting):

url_values = ('diarienr=&diaryyear=&text_arendemening=&text_avsandare=&regdatum_from=&'
              'regdatum_tom=&beslutsdatum_from=&beslutsdatum_tom=&dossnr_from=0&dossnr_tom=0&'
              'hits_page=10&sort=Regdatum&hits_search=50&sortorder=Fallande&action=S%F6k')

url = 'http://www.kkv.se/Diariet/default.asp?nav=2'
full_url = url + '?' + url_values
data = urllib.request.urlopen(full_url)
print(data.read())

But both snippets only return the source of the starting URL. Can anyone please point me in the right direction?

Thank you very much for your help. Equilib

2 Answers

You should remove the ?nav=2 from the URL you're posting to.
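
A minimal sketch of that change, assuming the form handler accepts a POST to default.asp without the query string (not verified against the live site). It also passes the button label as plain text: urlencode percent-encodes its input, so a pre-encoded 'S%F6k' would be escaped a second time and sent as 'S%25F6k'. The Latin-1 encoding is an assumption based on the %F6 seen in Firebug, and fetch is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

# Post to the form handler itself, without the ?nav=2 query string.
url = 'http://www.kkv.se/Diariet/default.asp'

values = {
    'action': 'Sök',  # plain text; urlencode does the percent-encoding
    'dossnr_from': '0',
    'dossnr_tom': '0',
    'hits_page': '10',
    'hits_search': '50',
    'sort': 'Regdatum',
    'sortorder': 'Fallande',
}

# Assumption: the site expects Latin-1, where 'ö' is encoded as %F6.
data = urllib.parse.urlencode(values, encoding='latin-1').encode('ascii')

# Supplying a data argument makes urllib send a POST instead of a GET.
req = urllib.request.Request(url, data)


def fetch(request):
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request) as response:
        return response.read()


print(data)
# the_page = fetch(req)  # uncomment to actually contact the site
```

Printing data rather than values shows exactly what will be sent, which makes it easy to compare against the POST body that Firebug reports.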

Daniel Roseman

Notice that in your second attempt the URL already includes a '?' and the query string starts with nav=2:

url = 'http://www.kkv.se/Diariet/default.asp?nav=2'

You then construct a full URL and include a redundant '?' after the base URL. That '?' should be an '&', since by the time the base URL is over, the query string has already begun.
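
One way to sidestep the '?' versus '&' question is to let urllib.parse merge the parameters into the existing query string. A minimal sketch; add_query_params is a hypothetical helper name, not part of urllib, and only two of the form fields are shown:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_params(url, params):
    """Return url with params appended to any query string it already has."""
    parts = urlsplit(url)
    # Keep the existing pairs (here nav=2), then append the new ones.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(params.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


base = 'http://www.kkv.se/Diariet/default.asp?nav=2'
full_url = add_query_params(base, {'dossnr_from': '0', 'dossnr_tom': '0'})
print(full_url)
# http://www.kkv.se/Diariet/default.asp?nav=2&dossnr_from=0&dossnr_tom=0
```

Note that this still sends a GET request; it only guarantees that the URL is well formed.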

Ryan
  • Thanks for the idea, Ryan. Unfortunately, changing the second "?" to "&" changes nothing. I'm still stuck on the first page. – Equilib May 02 '14 at 23:30