0

When I run the code below, I get a mechanize._html.ParseError exception. How do I make it shut up? I know it's invalid html, I wouldn't want to parse it if it was a nice website. I did google around, and was told to replace br = mechanize.Browser() with br = mechanize.Browser(factory=mechanize.RobustFactory()), but that didn't work.

import mechanize

#br  = mechanize.Browser()
br = mechanize.Browser(factory=mechanize.RobustFactory())
br.set_handle_robots(False)
br.open("http://journeyplanner.irishrail.ie/bin/query.exe")
for form in br.forms():
        print form
        print
wheybags
  • 627
  • 4
  • 15

1 Answers1

0

Why are you opening a .exe file with mechanize? You're supposed to open web pages using that. If you want to download the .exe file, use br.retrieve() instead.

Edit:

BTW, your code generated this output for me:

<formular POST http://journeyplanner.irishrail.ie/bin/query.exe/dn?ld=1.1&OK#focus application/x-www-form-urlencoded
    <HiddenControl(queryPageDisplayed=yes) (readonly)>
    <HiddenControl(HWAI=JS!ajax=yes) (disabled, readonly)>
    <HiddenControl(HWAI=JS!js=yes) (disabled, readonly)>
    <HiddenControl(outwardConDetails=) (readonly)>
    <ImageControl(start=Verbindung suchen)>
    <TextControl(REQ0JourneyStopsS0A=255)>
    <TextControl(REQ0JourneyStopsS0G=)>
    <HiddenControl(REQ0JourneyStopsS0ID=) (readonly)>
    <TextControl(REQ0JourneyStopsZ0A=255)>
    <TextControl(REQ0JourneyStopsZ0G=)>
    <HiddenControl(REQ0JourneyStopsZ0ID=) (readonly)>
    <RadioControl(journey_mode=[*single, return])>
    <TextControl(REQ0JourneyDate=17/01/2012)>
    <SelectControl(REQ0JourneyTime=[*0, 00, 9, 14, 18])>
    <HiddenControl(REQ0HafasPeriodToSearch=1440) (readonly)>
    <HiddenControl(REQ0HafasPeriodSearch=2) (readonly)>
    <HiddenControl(REQ0HafasSearchForw=1) (readonly)>
    <CheckboxControl(special_search_both=[1])>
    <TextControl(REQ1JourneyDate=)>
    <SelectControl(REQ1JourneyTime=[*0, 00, 9, 14, 18])>
    <HiddenControl(REQ1HafasPeriodToSearch=1440) (readonly)>
    <HiddenControl(REQ1HafasPeriodSearch=2) (readonly)>
    <HiddenControl(REQ1HafasSearchForw=1) (readonly)>
    <SubmitControl(start=Go) (readonly)>
    <SubmitControl(start=Go) (readonly)>>

Edit:

Oh, I was wrong... it's not a .exe file at all. I downloaded it and opened with a text editor, it's nothing but a .html file! It also works for br = mechanize.Browser()

Sufian Latif
  • 13,086
  • 3
  • 33
  • 70
  • I believe it works fine with older versions of beautifulsoup, as (iirc) mechanize uses beautifulsoup as it's html parser. However, this is not an option. – wheybags Jan 17 '12 at 19:23
  • I get Traceback (most recent call last): File "./train.py", line 9, in for form in br.forms(): File "/usr/lib/python2.6/dist-packages/mechanize/_mechanize.py", line 426, in forms return self._factory.forms() File "/usr/lib/python2.6/dist-packages/mechanize/_html.py", line 559, in forms self._forms_factory.forms()) File "/usr/lib/python2.6/dist-packages/mechanize/_html.py", line 228, in forms raise ParseError(exc) mechanize._html.ParseError – wheybags Jan 17 '12 at 19:24