I am learning web programming with Python, and one of the exercises I am working on is the following: I am writing a Python program to query the website "orbitz.com" and return the lowest airfare. The departure and arrival cities and dates are used to construct the URL.
I am doing this with the urlopen function, as follows (search_str contains the URL):
from lxml.html import parse
from urllib2 import urlopen
parsed = parse(urlopen(search_str))
doc = parsed.getroot()
links = doc.findall('.//a')
# Visit each link in turn and take its visible text
for link in links:
    the_link = link.text_content().strip()
The idea is to retrieve all the links from the query results, search them for strings such as "Delta" or "United", and read off the dollar amount next to each matching link.
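To make the matching step concrete, here is a minimal sketch of what I mean. It runs on a made-up list of strings rather than on the real page, and it assumes the price appears in the same text as the airline name, which may not hold for the actual Orbitz markup. The names AIRLINES, PRICE_RE and lowest_fare are only illustrative.

```python
import re

# Hypothetical airline names to look for; the real list may differ.
AIRLINES = ("Delta", "United")

# Matches a dollar amount such as "$249" or "$1,249.50".
PRICE_RE = re.compile(r"\$(\d[\d,]*(?:\.\d{2})?)")


def lowest_fare(link_texts):
    """Return (price, text) for the cheapest airline entry, or None."""
    fares = []
    for text in link_texts:
        # Skip links that do not mention any airline of interest.
        if not any(name in text for name in AIRLINES):
            continue
        match = PRICE_RE.search(text)
        if match:
            price = float(match.group(1).replace(",", ""))
            fares.append((price, text))
    return min(fares) if fares else None


# Made-up sample strings standing in for the stripped link texts.
sample = ["Delta $312", "United $1,249.50", "Sign in", "United $289"]
print(lowest_fare(sample))  # (289.0, 'United $289')
```

With the filler page, none of the link texts contain an airline name, so a function like this would simply return None.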
It worked until today, but it looks like orbitz.com has changed its results page. Now, when you enter the travel details on the orbitz.com website, a page appears showing a wheel and a message saying "looking up itineraries" or something to that effect. This is just a filler page and contains no real information. After a few seconds, the real results page is displayed. Unfortunately, the Python code returns the links for the filler page each time, and I never obtain the real results.
How can I get around this? I am a relative beginner to web programming, so any help is greatly appreciated.