Could anyone explain why the following problem occurs? In Python 2.7.12, when reading from a file, writing to it, and then reading from it again, Python seems to write garbage to the file.

For example, when running this in Python's IDLE:

import os 
testPath = r"myTestFile.txt"

## Make sure the file exists and is empty
with open(testPath,"w") as tFile:
    tFile.write("")

print "Our Test File: ", os.path.abspath(testPath )

with open(testPath, "r+") as tFile:
    ## First we read the file 
    data = tFile.read()

    ## Now we write some data 
    tFile.write('Some Data')

    ## Now we read the file again
    tFile.read()

Looking at the file afterwards, it contains the following:

Some Data @ sb d Z d d l m Z d d d ・ ・ YZ e d k r^ d d l m Z e d d d d e ・n d S( s9
Implement Idle Shell history mechanism with History

...

It seems to me that the read function starts reading at the end of the file and continues into the memory of whatever object comes after it, but this is pure speculation.

I looked up the text that ends up in the file, and it comes from:

Python27\Lib\idlelib\IdleHistory.py

I know I should call seek(0) after the write to resolve the problem, but I am interested in why the problem occurs.

Why does the last read() call seem to write data to the file?
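For reference, the seek(0) workaround mentioned above could be sketched as follows. This is only a minimal illustration, written to run on both Python 2.7 and 3.x; the helper name `read_write_read` and the temporary scratch file are assumptions made for the example, not part of the original script.

```python
import os
import tempfile


def read_write_read(path, payload):
    """Read a file, write payload at the end, then read the whole file back."""
    with open(path, "r+") as f:
        before = f.read()   # leaves the file position at end of file
        f.write(payload)    # written at the current position (the end)
        f.seek(0)           # reposition before switching from writing to reading
        after = f.read()
    return before, after


# Hypothetical scratch file so the sketch does not touch real data.
demo_path = os.path.join(tempfile.mkdtemp(), "myTestFile.txt")
open(demo_path, "w").close()  # create the file and leave it empty

before, after = read_write_read(demo_path, "Some Data")
```

With the seek(0) in place, the second read starts from the beginning of the file instead of continuing straight after the write.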

Thanks!

  • Note: I tested this in Python 2.7.12 and Python 3.5.2. In Python 3.5.2 it seems to be fixed, and no garbage data is written.
Jan
  • On some OSes (Windows), reading and writing without a `seek` in between is not valid. It seems a bit sketchy that it's writing data from another part of memory, but if you're invoking undefined behavior at the C library level, just about anything can happen. At least no [demons are flying out of your nose](http://www.catb.org/jargon/html/N/nasal-demons.html). – Blckknght Nov 02 '16 at 08:11
  • I cannot reproduce the issue on Python 2.7 or 3.5 on OS X. Could well be Windows specific. What happens if the file is opened as binary? – cdarke Nov 02 '16 at 08:17
  • @cdarke: Interesting, it does indeed seem to be Windows related. When using r+b mode I get the same result. – Jan Nov 02 '16 at 08:58
  • Definitely confirmed on Windows 7 here with Python 2.7. In fact, I got a large chunk of the Python code itself written to the file before a full page of nonsense. – roganjosh Nov 02 '16 at 09:11
  • It surprises me that this has not been noticed before; I searched and could not find it. I have been looking at the 2.7 source code and could only see issues around line endings (hence the 'b' suggestion). It looks like a buffer overrun. I'll look for differences in the Python 3 source code. – cdarke Nov 02 '16 at 10:47
  • Small note: I noticed that it also works fine when calling .flush() between the write and the last read. I tried to look at the C code, but that is still a bit of a puzzle. – Jan Nov 04 '16 at 08:40
  • @Jan - sounds like you've found a bug, care to [file it](https://docs.python.org/2/bugs.html#using-the-tracker)? – dimo414 Mar 11 '17 at 02:17
  • @dimo414 - Sure my pleasure. – Jan Mar 15 '17 at 09:22
  • Here's the [issue](http://bugs.python.org/issue29817) that Jan opened. – Eryk Sun Mar 15 '17 at 14:17
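
The .flush() observation from the comments could be sketched like this. It is a minimal illustration only, written to run on both Python 2.7 and 3.x; the helper name `write_flush_read` and the temporary scratch file are assumptions made for the example. Why the flush helps is not established in this thread beyond Blckknght's remark about C-level undefined behavior, so treat it as a way to reproduce the comparison rather than an explanation.

```python
import os
import tempfile


def write_flush_read(path, payload):
    """Read, write, flush, then read again without seeking."""
    with open(path, "r+") as f:
        f.read()           # first read, leaves the position at end of file
        f.write(payload)   # write while positioned at the end
        f.flush()          # push buffered output out before switching to reading
        tail = f.read()    # no seek: reads whatever follows the written data
    with open(path, "r") as f:
        contents = f.read()  # what actually ended up in the file
    return tail, contents


# Hypothetical scratch file so the sketch does not touch real data.
scratch_path = os.path.join(tempfile.mkdtemp(), "myTestFile.txt")
open(scratch_path, "w").close()  # create the file and leave it empty

tail, contents = write_flush_read(scratch_path, "Some Data")
```

Comparing `contents` with and without the `f.flush()` line on an affected Windows / Python 2.7 setup is one way to check the behavior Jan describes.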
