Suppose I want to change the first two characters of the second line of the text file text_file to XX.

Say text_file has contents:

line1
line2
line3

I wrote the following script to accomplish the task:

f = open('text_file', 'r+')
f.readline()
f.write('XX')
f.close() # this flushes changes to disk implicitly

When I run this code in Python 2.7.9, this works fine, altering text_file so that it becomes:

line1
XXne2
line3

However, when I run it in Python 3.4.3, text_file ends up having these contents:

line1
line2
line3
XX

Now, if I alter the code like this:

f = open('text_file', 'r+')
f.readline()
f.seek(f.tell()) # shouldn't this be doing nothing?
f.write('XX')
f.close() # this flushes changes to disk implicitly

And run it in Python 3.4.3, the results are as desired.
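For reference, the workaround above can be wrapped in a small function. This is only a sketch of the same `seek(tell())` approach: the helper name `replace_line2_prefix_text` and its parameters are made up for illustration, and it assumes the replacement is no longer than the second line and that the file uses plain `\n` line endings.

```python
def replace_line2_prefix_text(path, replacement='XX'):
    """Overwrite the start of the second line in place, using text mode.

    Hypothetical helper for illustration; it mirrors the modified script
    from the question rather than any documented recipe.
    """
    with open(path, 'r+') as f:
        f.readline()          # read past the first line
        # Seek to the position tell() reports. In the question, this is the
        # step that made the following write land at the start of line 2
        # on Python 3.4.3.
        f.seek(f.tell())
        f.write(replacement)  # overwrites characters, does not insert
    # Leaving the with-block closes the file and flushes the write to disk.
```

Calling `replace_line2_prefix_text('text_file')` on the three-line example file should leave `XXne2` as its second line.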

I don't understand what's going on here: why does write() start writing at that unexpected position(*)? And seeking to the current position shouldn't change anything, right?

Hope somebody can shed some light on this for me, thanks!

UPDATE: It's probably worth mentioning the OS I'm using. It's Xubuntu 15.04, Linux kernel 3.19.

UPDATE 2: As suggested by @cdarke, opening the file in binary mode rb+ (and accordingly writing bytes rather than str) makes the script work in Python 3.4.3 without the seek call I used above. Why does this work when my original approach does not?
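A minimal sketch of the binary-mode variant described in UPDATE 2 follows. The function name `replace_line2_prefix_binary` and its parameters are hypothetical, chosen only for this example. It assumes the replacement is given as bytes and is no longer than the second line.

```python
def replace_line2_prefix_binary(path, replacement=b'XX'):
    """Overwrite the start of the second line in place, using binary mode.

    Hypothetical helper for illustration; it follows the rb+ suggestion
    from the comments, with no explicit seek between read and write.
    """
    with open(path, 'rb+') as f:
        f.readline()          # read past the first line (up to and including b'\n')
        f.write(replacement)  # must be bytes in binary mode; overwrites in place
    # Leaving the with-block closes the file and flushes the write to disk.
```

Because the file is handled as raw bytes here, no decoding or newline translation is involved, so the offsets are plain byte offsets.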

(*): NOTE: That it starts writing at the end of the buffer is a matter of coincidence. I know this because the actual text file I'm working with is much larger, and there it didn't start writing at the end of the buffer.

  • Unable to reproduce on OS X (2.7.10 and 3.5). Try opening in binary mode. – cdarke Nov 11 '16 at 16:56
  • @cdarke Thanks for checking and for your suggestion. Changing open mode to `rb+` does the trick. Any idea why this makes a difference and why text mode does not work? –  Nov 11 '16 at 18:06
  • The documentation implies that what you are trying is undefined, although it doesn't directly show your example: "In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very file end with seek(0, 2)) and the only valid offset values are those returned from the f.tell(), or zero. Any other offset value produces undefined behaviour." (from https://docs.python.org/3/tutorial/inputoutput.html) – cdarke Nov 11 '16 at 20:19
  • It was that para that made me suggest binary access, but I could not test it because it worked for me - hence only a comment and not an answer. I have also had to use the `f.seek(f.tell())` trick with EOF operations (like emulating `tail -f`). – cdarke Nov 11 '16 at 20:23
  • @cdarke I've read that part of the documentation, but I fail to see what your citation has to do with my problem. I don't call `seek` anywhere in the original problem, so how can its behavioral differences between binary and text mode be relevant to the problem? –  Nov 11 '16 at 20:30
  • I agree that it does not show your exact issue, but what it implies is that the file position is undefined in text mode (other than the documented exceptions). When reading and writing to the same file it is usual to use random access and binary mode, since text mode is not suitable for random access. So I am saying that this is expected behaviour, however I agree that it would be better if it was explicitly documented. – cdarke Nov 12 '16 at 07:35

0 Answers