
In Python, you can read a file and load its lines into a list with:

f = open('file.txt','r')
lines = f.readlines()

Each line is delimited by \n, but a \r inside a line is not treated as a line break. I need every \r to be treated as \n so that I get the correct list of lines.

If I call .split('\r') on each element of lines, I get lists inside the list.

I thought about opening the file, replacing every \r with \n, closing the file, reading it in again and then calling readlines(), but this seems wasteful.

How should I implement this?

greye
  • Actually, if you have a mix of `\n` and `\r` newlines, and if the latter occur within the "real" lines separated by `\n`, then getting lists inside the list appears to me to be the Right Thing. – Tim Pietzcker Nov 23 '09 at 19:09
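If the nested result described in the question is acceptable as an intermediate step, it can be flattened afterwards. The following is only a sketch: the sample `lines` value is made up to stand in for what `readlines()` might return, and `nested` and `flat` are illustrative names.

```python
from itertools import chain

# Hypothetical result of readlines() on a file whose \n-delimited
# lines contain embedded \r characters.
lines = ["a\rb\n", "c\n"]

# Splitting each line on \r gives one inner list per original line.
nested = [line.rstrip("\n").split("\r") for line in lines]

# chain.from_iterable joins the inner lists into a single flat list.
flat = list(chain.from_iterable(nested))

print(nested)
print(flat)
```

Whether the nested or the flat form is the right one depends on whether the \r characters are meant as line breaks or as separators inside a line.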

2 Answers

f = open('file.txt','rU')

This opens the file in Python 2's universal newline mode, in which \r, \n and \r\n are all treated as end-of-line and returned as \n.
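For readers on Python 3, here is a minimal sketch of the same idea. Text mode there applies universal newline translation whenever `newline` is left at its default of `None`, so no 'U' flag is needed. The sample bytes and the temporary file are assumptions made only to keep the example self-contained.

```python
import os
import tempfile

# Hypothetical sample data mixing all three line-ending conventions.
sample = b"first\rsecond\nthird\r\nfourth"

# Write the bytes to a temporary file so the example can run on its own.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    # newline=None is the Python 3 default for text mode: on read,
    # '\r', '\n' and '\r\n' are all translated to '\n'.
    with open(path, "r", newline=None) as f:
        lines = f.readlines()
finally:
    os.remove(path)

print(lines)
```

Each element keeps a trailing '\n' except possibly the last one, exactly as `readlines()` behaves for ordinary \n-terminated files.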

abyx
Ned Deily
  • ...although this feature is deprecated and should not be used in new code according to the Python documentation. – Tim Pietzcker Nov 23 '09 at 19:07
  • Thanks! This works as intended and is sufficient for me. Tim, what would be the correct way to do it now? – greye Nov 23 '09 at 19:10
  • In Python 3.x universal newline support is on by default, so you don't have to do anything. – Jason Orendorff Nov 23 '09 at 19:22
  • This isn't working for me, unfortunately. Every time I call readlines() on a Mac, my list's length is always 1! – fIwJlxSzApHEZIl Feb 21 '14 at 01:45
  • @advocate, What version of Python? You should probably open a new question with all the details of your problem. – Ned Deily Feb 21 '14 at 01:48
  • Found the problem. readlines() always fails for me, but read().split("\r") does the same job and works. .split("\n") also fails on every file, probably for the same reason that readlines() doesn't work. This took me days to figure out! It is my first time working professionally on a Mac and it tripped me up for a long time. This is Python 2.7.5 on OS X 10.9.1. Opening the file with 'rU' did NOT fix my newline / carriage return problem, unfortunately. – fIwJlxSzApHEZIl Feb 21 '14 at 18:56

If it's a concern, open in binary format and convert with this code:

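from __future__ import with_statement  # only needed on Python 2.5

# filename is the path of the file to read.
with open(filename, "rb") as f:
    # Convert \r\n first so it does not become two newlines, then any lone \r.
    # On Python 3, read() returns bytes here, so use b'\r\n', b'\r' and b'\n'.
    s = f.read().replace('\r\n', '\n').replace('\r', '\n')
    lines = s.split('\n')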
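On Python 3 a file opened in binary mode yields bytes, so the same normalization needs bytes literals. The following is a sketch under that assumption; `split_lines` is a hypothetical helper name and the sample input is made up.

```python
def split_lines(data: bytes) -> list:
    """Split raw bytes into lines, treating CRLF, CR and LF as line endings."""
    # Replace \r\n before \r so a Windows line ending does not become
    # two separate newlines.
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalized.split(b"\n")


# Hypothetical input; in practice this would come from
# open(filename, "rb").read().
raw = b"a\rb\r\nc\nd"
lines = split_lines(raw)
print(lines)
```

Unlike `readlines()`, this drops the line terminators, and input that ends with a newline produces a final empty element.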
hughdbrown