The documentation for Python's input/output states, under Reading and Writing Files:

https://docs.python.org/3.5/tutorial/inputoutput.html#methods-of-file-objects

"When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory. Otherwise, at most size bytes are read and returned."

Let's take the following code:

size = 1000
with open('file.txt', 'r') as f:
    while True:
        read_data = f.read(size)
        if not read_data:
            break
        print(read_data)   # prints chunks of at most 1000 characters each (the file is opened in text mode)

Here, each call to read returns at most 1000 bytes. What determines "at most"?

Let's say we are parsing rows of structured data. Each row is 750 bytes. Would read "cut off" the next row, or stop at the \n?

ShanZhengYang
  • When you use `f.read()`, you aren't *"parsing rows"*. `\n` is just another character, so yes it will cut off in a row. The idea of *"lines"* in a text file is largely fictional! – jonrsharpe Sep 15 '16 at 12:59
  • If you pass a number to read it reads that number of bytes. – Padraic Cunningham Sep 15 '16 at 12:59
  • @jonrsharpe Let's say we were parsing rows. It appears the only way to use this parameter would be to calculate the exact number of bytes per row (hoping that it is of equal byte size). Otherwise, how would you do this using f.read()? E.g. give back several complete rows – ShanZhengYang Sep 15 '16 at 13:06
  • If you were parsing rows, you wouldn't use `.read`, you'd use `for row in f:`. – jonrsharpe Sep 15 '16 at 13:06
  • @jonrsharpe Thanks. Let's say I were to use `subprocess` instead with `Popen()`---is there a `for row in f` analogue? – ShanZhengYang Sep 15 '16 at 14:55
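
As a rough illustration of the last question in the comments: the pipe returned by `Popen()` is itself a file object, so it can be iterated line by line in the same way as `for row in f`. The sketch below assumes a hypothetical child command (a Python one-liner that prints three rows) standing in for whatever process is actually being run, and it assumes text mode is wanted (`universal_newlines=True`).

```python
import subprocess
import sys

# Hypothetical child process that prints three rows; it stands in for any real command.
child = [sys.executable, "-c", "print('row 1'); print('row 2'); print('row 3')"]

rows = []
# universal_newlines=True opens the pipe in text mode, so each item is a str ending in "\n".
with subprocess.Popen(child, stdout=subprocess.PIPE, universal_newlines=True) as proc:
    for row in proc.stdout:          # yields one complete line at a time
        rows.append(row.rstrip("\n"))

print(rows)
```

Each iteration yields one whole line, so no row is cut off regardless of its length.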

1 Answer

read is not readline or readlines. It just reads the requested amount of data regardless of the file content, so a \n is treated like any other character. The only exception is end-of-line translation, which applies because your file is opened as text.

  • If at least 1000 characters are left, it returns 1000. In Python 3 text mode, size is counted in characters after decoding and newline translation, so a Windows \r\n pair (CR+LF) read as text counts as a single \n.
  • If only 700 characters are left, it returns 700.
  • If there's nothing left to read, it returns an empty string (len(read_data) == 0).
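
The three cases above can be checked with a small sketch. It assumes the layout from the question (rows of 750 characters, newline included) and writes them to a temporary file whose name is only illustrative.

```python
import os
import tempfile

# Two 750-character rows: 749 characters of payload plus "\n" each (assumed layout).
rows = ["a" * 749 + "\n", "b" * 749 + "\n"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.txt")
    # newline="" writes "\n" unchanged on every platform.
    with open(path, "w", newline="") as f:
        f.writelines(rows)

    chunks = []
    with open(path, "r") as f:
        while True:
            chunk = f.read(1000)
            if not chunk:            # empty string means end of file
                break
            chunks.append(chunk)

print([len(c) for c in chunks])      # [1000, 500]
```

The first chunk holds the whole first row plus the first 250 characters of the second, so read does cut a row in the middle. The second chunk is shorter than 1000 only because the file ran out.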
Jean-François Fabre