I put print() calls right next to my write() calls at the end of my code to test why my output files were incomplete. The print() output is everything I expect, but the write() output is short by a confusing amount (only 150 out of 200 'things'). Reference image of output: IDLE versus external output file

FYI: Win 7 64 // Python 3.4.2

My module takes an SRT captions file ('test.srt') and builds a list object from it; in particular, one with 220 list entries of the form: [[(index), [time], string]]

times = open('times.txt', 'w')

### A portion of Riobard's SRT Parser: srt.py
import re

def tc2ms(tc):
    ''' convert timecode to millisecond '''

    sign    = 1
    if tc[0] in "+-":
        sign    = -1 if tc[0] == "-" else 1
        tc  = tc[1:]

    TIMECODE_RE     = re.compile('(?:(?:(?:(\d?\d):)?(\d?\d):)?(\d?\d))?(?:[,.](\d?\d?\d))?')
    match   = TIMECODE_RE.match(tc)
    try: 
        assert match is not None
    except AssertionError:
        print(tc)
    hh,mm,ss,ms = map(lambda x: 0 if x==None else int(x), match.groups())
    return ((hh*3600 + mm*60 + ss) * 1000 + ms) * sign

# my code
with open('test.srt') as f:
    file = f.read()

srt = []

for line in file:
    splitter = file.split("\n\n")

# SRT splitter
i = 0
j = len(splitter)
for items in splitter:
    while i <= j - 2:
        split_point_1 = splitter[i].index("\n")
        split_point_2 = splitter[i].index("\n", split_point_1 + 1)
        index = splitter[i][:split_point_1]
        time = [splitter[i][split_point_1:split_point_2]]
        time = time[0][1:]
        string = splitter[i][split_point_2:]
        string = string[1:]
        list = [[(index), [time], string]]
        srt += list
        i += 1

# time info outputter
i = 0
j = 1
for line in srt:
    if i != len(srt) - 1:
        indexer = srt[i][1][0].index(" --> ")
        timein = srt[i][1][0][:indexer]
        timeout = srt[i][1][0][-indexer:]
        line_time = (tc2ms(timeout) - tc2ms(timein))/1000
        space_time = ((tc2ms((srt[j][1][0][:indexer]))) - (tc2ms(srt[i][1][0][-indexer:])))/1000
        out1 = "The space between Line " + str(i) + " and Line " + str(j) + " lasts " + str(space_time) + " seconds." + "\n"
        out2 = "Line " + str(i) + ": " + str(srt[i][2]) + "\n\n"
        times.write(out1)
        times.write(out2)
        print(out1, end="")
        print(out2)
        i += 1
        j += 1
    else:
        indexer = srt[i][1][0].index(" --> ")
        timein = srt[i][1][0][:indexer]
        timeout = srt[i][1][0][-indexer:]
        line_time = (tc2ms(timeout) - tc2ms(timein))/1000
        outend = "Line " + str(i) + ": " + str(srt[i][2]) + "\n<End of File>"
        times.write(outend)
        print(outend)

My two write() output files contain only ~150 and ~200 items, respectively, of the 220 items that print() otherwise correctly prints to the screen.

  • `print()` is implemented in terms of `write` calls to the `stdout` file. Something else is wrong; `write()` is not limited. – Martijn Pieters May 06 '15 at 14:45
  • This is too large a piece of code for us to help debug. Can you please reduce this to a smaller sample that still reproduces the issue? – Martijn Pieters May 06 '15 at 14:47
  • You're not closing or flushing your files, so some of the data is still in the buffer at the end. That's why you should always use a `with` statement when dealing with files; it will call `close()` for you. – Ashwini Chaudhary May 06 '15 at 15:10
  • @MartijnPieters; sorry about that. I pared it down to the essentials (still works, still produces problem). – Matt Varner May 06 '15 at 15:37
  • @AshwiniChaudhary: Thanks for responding. I'm using `with` when opening the source file and it's closing. Are you saying that I should use `with` also with my output file? I guess like `with open('times.txt') as times:` etc etc? – Matt Varner May 06 '15 at 15:40
  • Use `with` for all files, especially files opened in write mode. – Ashwini Chaudhary May 06 '15 at 15:46
  • Not sure if this makes it more clear or not. [Reference to output](http://i.imgur.com/CnBtx74.jpg) Oh! Thanks. @AshwiniChaudhary I will do that right now. And I'll look up 'buffer' while I'm at it. Appreciate it. – Matt Varner May 06 '15 at 15:47
  • @MattVarner: the reference output is actually helpful in that it helped me see much more clearly where your print and write calls are. Maybe add a link to that image in your question. Yes, you need to either explicitly flush, close your files or use them as context managers in a `with` statement (which closes for you, and closing flushes). – Martijn Pieters May 06 '15 at 16:08
  • @MartijnPieters - Done. Thanks again. I was using syntax from Python Programming for the Absolute Beginner, 3rd Ed., and in all these tutorial examples, you're never dealing with lots and lots of output. This was my first contest against the 'buffer.' :: sheepish grin :: – Matt Varner May 07 '15 at 07:57

1 Answer

You need to close your times file when you are done writing. Python file objects (and the operating system) buffer writes to speed up file I/O, collecting larger blocks of data to be written to disk in one go; closing the file flushes that buffer:

times.close()

Consider opening the file in a with block:

with open('times.txt', 'w') as times:
    # all code that needs to write to times
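
As a rough illustration of the buffering described above, here is a minimal sketch. The sample strings, the temporary file names, and the helper `write_report` are made up for this example and are not part of the question's code. It writes the same 220 lines twice: once to a file that is left open (so the data can still be sitting in the buffer), and once inside a `with` block.

```python
import os
import tempfile


def write_report(path, lines):
    """Write every line to path; the with block closes (and so flushes) the file."""
    with open(path, 'w') as times:
        for line in lines:
            times.write(line)
    # times is closed here, even if an exception was raised inside the block


# Hypothetical stand-in for the out1/out2 strings built in the question.
lines = ["Line {}: some caption text\n".format(i) for i in range(220)]

tmpdir = tempfile.mkdtemp()
unclosed_path = os.path.join(tmpdir, 'unclosed.txt')
closed_path = os.path.join(tmpdir, 'closed.txt')

# Pattern from the question: open, write, but do not close right away.
times = open(unclosed_path, 'w')
for line in lines:
    times.write(line)
size_before_close = os.path.getsize(unclosed_path)  # buffered data not on disk yet
times.close()                                       # closing flushes the buffer
size_after_close = os.path.getsize(unclosed_path)

# Pattern from the answer: let the with block close the file.
write_report(closed_path, lines)
with open(closed_path) as f:
    written = f.readlines()

print(size_before_close, size_after_close, len(written))
```

How much data reaches the disk before `close()` depends on the buffer size and the platform, which would explain why the truncated output stops at a seemingly arbitrary point. If the file has to stay open for a long time, calling `times.flush()` after a batch of writes is another option.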
Martijn Pieters