I wrote a small script to fetch some data from a website and store it in a file. The fetched data is held in a variable, "content".

try:
    content = urllib.urlopen(url).read()
except:
    content = ""

The file contains a few short phrases, one per line. I want to replace the last line of the file every time the script runs, so I'm using the following code:

try:
    f = open("MYFILENAME","r+")                     # open file for reading and writing    
    lines = f.readlines()
    replace_index = len(lines[-1])              
    f.seek(long(int(f.tell())-replace_index))       # should move to the start of last line
    # content[start:end] has no "\n" for sure.
    f.write(content[start:end] + " " + now + "\n")
except Exception as exc:
    print "This is an exception",exc
finally:
    f.close()

I use crontab to run this script every minute and update "MYFILENAME". But the script sometimes behaves oddly: instead of replacing the last line, it appends a new line to the file. This usually happens after I restart the computer or wake it from sleep.

If the original file was:

xyz
abc
bla bla bla
1 2 3

I'm expecting the output to be:

xyz
abc
bla bla bla
my_new_small_phrase

Instead, sometimes I get:

xyz
abc
bla bla bla
1 2 3
my_new_small_phrase

What is wrong with the above code? (I'm using crontab and the seek and tell functions for the first time, so I'm not sure about either of them.) Or does it have something to do with the "\n" at the end of the write() call?

pymd
  • Are you using "plain" `cron` or a clone such as `anacron`, `fcron`, and so on? – Sylvain Leroux Jun 20 '13 at 07:39
  • In addition, how do you deal with `\n` in the "content"? – Sylvain Leroux Jun 20 '13 at 07:40
  • I'm using "plain" cron. I've written the "python myscript.py" command in the crontab file. – pymd Jun 20 '13 at 07:40
  • "\n" inside content shouldn't be a problem. I'm actually only writing a slice of content, and I'm sure there is no "\n" inside that slice. – pymd Jun 20 '13 at 07:42
  • I think you should edit your question to show (1) a typical example of "content" read from your URL, (2) the expected output, (3) the "weird" output, and (4) the full significant part of your code. I don't see any "slice" here. – Sylvain Leroux Jun 20 '13 at 07:44
  • Thanks! I've edited the question; I hope that makes it clear. – pymd Jun 20 '13 at 07:52
  • If you read the data every minute from the website, could it be that the file changes on the server while you read it and you get a "mixed" version? Could it be that the downloaded file sometimes has an additional newline at the end? – Michael Butscher Jun 20 '13 at 08:27
  • @MichaelButscher Yes, the file does sometimes change frequently (during matches: I'm simply fetching the current score into a file), but I still don't think that is the reason: even when it's not changing, if I wake my computer from sleep, a lot of lines accumulate in the file within a few minutes. Also, I'm pretty sure there's no "\n" at the end at any instant. – pymd Jun 20 '13 at 08:38

1 Answer

You could also do it like this:

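# Read all lines, then drop the last one.
with open("filename", "r") as f:
    lines = f.readlines()
del lines[-1]

# Rewrite the whole file, then write the replacement for the last line.
with open("filename", "w") as f:
    f.writelines(lines)
    f.write("new content")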
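
The approach above rewrites the whole file. If you would rather keep the seek-based idea from the question, here is a minimal sketch of one way it could look. It is not a diagnosis of the cron behavior; it only addresses one known pitfall: seeking back and writing does not shorten the file, so a replacement shorter than the old last line leaves leftover bytes behind unless the file is truncated first. The function name `replace_last_line` and its parameters are hypothetical, chosen for illustration.

```python
import os


def replace_last_line(path, new_line):
    """Replace the final line of the file at `path` with `new_line` (bytes).

    Hypothetical helper for illustration; assumes the file fits in memory.
    """
    # Binary mode keeps offsets as exact byte counts on every platform.
    with open(path, "rb+") as f:
        lines = f.readlines()
        last_len = len(lines[-1]) if lines else 0
        # Jump to the first byte of the last line, counted from the end.
        f.seek(-last_len, os.SEEK_END)
        # Cut the file here so no bytes of the old line survive.
        f.truncate()
        # Write the replacement with exactly one trailing newline.
        f.write(new_line.rstrip(b"\n") + b"\n")
```

Calling `replace_last_line("MYFILENAME", b"my_new_small_phrase")` on the example file from the question should leave the first three lines untouched and end the file with the new phrase, whether the old last line was longer or shorter.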
cptPH
  • Thanks for the much simpler way to do it. :) However, can you please tell me why the above code is behaving strangely? – pymd Jun 20 '13 at 08:32