2

I'm writing a Python script which processes a text file. I expect to process files generated from different people, working under different operating systems. Is there a nice way to figure out which OS created the text file, and specify the end-of-line convention to make parsing line-by-line trivial?

tshepang
ajwood

4 Answers

3

Use universal newline mode when opening the file.

with open('input.txt', 'rU') as fp:
  for line in fp:
    print line
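For readers on Python 3, here is a minimal sketch of the same idea. Text mode already uses universal newlines by default (`newline=None`), and the `'U'` flag was removed in Python 3.11. The file object's `newlines` attribute reports which terminators were actually seen, which is a reasonable hint about the convention the file was written with. The helper name `read_lines` is hypothetical.

```python
def read_lines(path):
    """Return (lines, newlines_seen) for a text file with any EOL convention."""
    # newline=None (the default) enables universal newlines: '\r\n', '\r'
    # and '\n' are all translated to '\n' while reading.
    with open(path, "r", newline=None) as fp:
        lines = list(fp)
        # After reading, fp.newlines holds the terminator(s) encountered:
        # a string, a tuple of strings if the file mixes conventions,
        # or None if no line ending was seen.
        seen = fp.newlines
    return lines, seen
```

Calling `read_lines('input.txt')` on a file saved under Windows should report `'\r\n'` as the second element, while every returned line ends in a plain `'\n'`.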
Ignacio Vazquez-Abrams
2

splitlines() handles various line terminators:

>>> 'foo\nbar'.splitlines()
['foo', 'bar']
>>> 'foo\rbar'.splitlines()
['foo', 'bar']
>>> 'foo\r\nbar'.splitlines()
['foo', 'bar']
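One caveat worth illustrating, as a sketch for Python 3: `str.splitlines()` also splits on a few other boundaries (form feed, for example), so text containing such characters may break in unexpected places. The helper `split_eol` below is a hypothetical stricter alternative that only recognizes the three common conventions.

```python
import re

# Matches exactly the three common end-of-line conventions; the two-character
# Windows ending comes first so it is not split into two separate breaks.
_EOL = re.compile(r"\r\n|\r|\n")

def split_eol(text):
    """Split text on CRLF, CR or LF only (hypothetical helper)."""
    return _EOL.split(text)

mixed = "foo\r\nbar\rbaz\nqux"
print(mixed.splitlines())      # all three conventions handled in one string
print("a\x0cb".splitlines())   # a form feed is also treated as a line boundary
print(split_eol("a\x0cb"))     # the stricter helper leaves the form feed alone
```

Unlike `splitlines()`, `split_eol` returns a trailing empty string when the text ends with a line terminator, so the two are not drop-in replacements for each other.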
mechanical_meat
1

If you do not care about trailing whitespace:

for line in [l.rstrip() for l in open('test.py').read().split('\n')]:
    print line

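Splitting on '\n' takes care of Linux and Mac OS X files, and rstrip() removes the '\r' left over from Windows line endings (along with any other trailing whitespace). Files that use a bare '\r' as the terminator will not be split by this approach.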

nate c
  • This gives me an error: AttributeError: 'list' object has no attribute 'rstrip' – ajwood Dec 04 '10 at 02:09
  • Mauahah got it! I had to change split('\n') to split('\r') though. Thanks very much! – ajwood Dec 04 '10 at 02:17
  • Oh darn, there's one more thing... I've got to do something special with the 1st line of the file, then process the rest... – ajwood Dec 04 '10 at 02:19
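For the last comment (treating the first line specially), one possible sketch in Python 3 is to pull the first line off the file iterator with `next()` and then loop over the remainder. The function name `process_file` and the choice to return a `(header, body)` pair are assumptions made for illustration.

```python
def process_file(path):
    """Return (header, body_lines), both with line endings stripped."""
    # newline=None translates '\r\n', '\r' and '\n' to '\n' while reading,
    # so the same code works for Windows, classic Mac and Unix files.
    with open(path, "r", newline=None) as fp:
        # next() consumes only the first line; None signals an empty file.
        header = next(fp, None)
        if header is not None:
            header = header.rstrip("\n")
        # The iterator resumes at the second line.
        body = [line.rstrip("\n") for line in fp]
    return header, body
```

Because only `'\n'` is stripped, other trailing whitespace in each line is preserved.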
1

You want to use file.readlines(), which returns a list containing the lines in the file.

lines = open('info.txt').readlines()
for line in lines:
    print line

See the documentation on Python file objects.
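Note that `readlines()` keeps the line terminator on every element, so printing each element adds a second newline. A small sketch in Python 3, with the hypothetical helper name `read_stripped`, that closes the file and drops the terminators:

```python
def read_stripped(path):
    """Read all lines with readlines() and drop the trailing newline."""
    # The with-block closes the file even if reading raises an exception.
    with open(path) as fp:
        lines = fp.readlines()
    # Text mode has already translated every terminator to '\n'.
    return [line.rstrip("\n") for line in lines]
```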

Velociraptors