
I am trying to save updated Forex ticker data from this website: http://forex.offers4u.biz/TickDBReadDB.php?p=EURUSD

Just hit refresh to update the ticker.

When I use my little Python script, it saves the text once, but if I run it again, it makes a new file with the same old data. How can I add a "cachebreaker" so that Python can read the new data from the old URL?

import urllib2, time

filename = 'EURUSD ' + str(time.asctime()) + '.txt'

myfile = open(filename, 'w')

page = urllib2.urlopen("http://forex.offers4u.biz/TickDBReadDB.php?p=EURUSD?")

for line in page:
    myfile.write(line)

myfile.close()
page.close()
  • I ran this script and read the output file which reported an error - 'mysql_numrows(): supplied argument is not a valid MySQL result '. – John Keyes Nov 20 '09 at 00:44
  • The webpage isn't changing at the moment... presumably because the market is closed. Is this the issue, or should we retry when the market reopens tomorrow? – Jarret Hardie Nov 20 '09 at 00:56

1 Answer


urllib2 doesn't do any caching itself. Are you going through a proxy? Alternatively, the server may be caching the response.

Try sending a Cache-Control header, described in RFC 2616, section 14.9.
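As a rough sketch of what that could look like: the snippet below builds a request carrying `Cache-Control: no-cache`, plus the older `Pragma: no-cache` for HTTP/1.0 proxies. It is written for Python 3, where `urllib2.Request` became `urllib.request.Request`; in Python 2 the `add_header` calls are the same. The helper names `fresh_request`, `with_cachebuster` and `fetch_fresh` are made up for illustration, and the `_` query parameter is an arbitrary "cachebreaker" name that the server does not define. Whether this particular server tolerates an extra parameter is an assumption you would need to check.

```python
import time
import urllib.request  # Python 3; in Python 2 the Request class lives in urllib2
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

URL = "http://forex.offers4u.biz/TickDBReadDB.php?p=EURUSD"


def fresh_request(url):
    """Build a request asking intermediate caches not to serve a stored copy."""
    req = urllib.request.Request(url)
    req.add_header("Cache-Control", "no-cache")
    req.add_header("Pragma", "no-cache")  # HTTP/1.0 counterpart for old proxies
    return req


def with_cachebuster(url, now=None):
    """Append a throwaway timestamp parameter so every URL is unique.

    The parameter name "_" is a hypothetical choice, not part of the site's API.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    stamp = int(time.time() if now is None else now)
    query.append(("_", str(stamp)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def fetch_fresh(url):
    """Download the page as bytes, using both techniques together."""
    req = fresh_request(with_cachebuster(url))
    with urllib.request.urlopen(req) as page:
        return page.read()
```

Calling `fetch_fresh(URL)` and writing the returned bytes to a file opened in `'wb'` mode would replace the read loop in the question. If the data still doesn't change, the caching (or the stale data) is probably on the server side, and no request header will fix that.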

EDIT: Mind you, the most recent data on that page is from 2009.11.16 20:47:37. Are you sure it's still being actively updated?

Steve Folly
  • You are right about the data on that webpage--it is not actively updating right now! I've contacted the admin about it, once it works I'll see if my problem still persists. Thanks for the help in the meantime! –  Nov 20 '09 at 02:18
  • I looked at the Cache-Control section you linked. It looks like the "no-cache" or "no-store" directives will work. These are HTTP commands, right? How can I use them with Python? Should I parse the URL and insert a "no-cache" directive in there somewhere? Sorry for being so clueless. –  Nov 20 '09 at 05:09
  • Take a look at http://docs.python.org/library/urllib2.html. The last-but-one example at the bottom of the page shows an example with a request header. You'll need to do `req.add_header( 'Cache-Control', 'no-cache')` or similar. – Steve Folly Nov 20 '09 at 06:20