I'm seeing strange behavior (or is it a bug?) with the Python codecs module. I'm using Python 2.7.4.

Suppose I want to read the following file called foo:

0
aaaa
bbbb
cddd
dddddddd

Long sentence here which is not even read completely

The rest is ignored...

I use the following code for that:

import codecs
log = codecs.open('foo', encoding='utf8')
log.readline()           # skip the first line ("0")
lines = log.readlines()  # expected to return every remaining line

print ''.join(lines)

The result I get is

aaaa
bbbb
cddd
dddddddd

Long sentence here which is not even read com

As you can see, the file is not read in its entirety: the output stops in the middle of the long sentence. Is there any explanation for that?

(The problem does not occur if I omit the call to "readline", or if I don't use any encoding... This is all very mysterious to me.)

Olivier Verdier
  • Yes, it's the same problem. I would like to see a reference to the codecs documentation, though. This behaviour of readline/readlines is completely insane, isn't it? – Olivier Verdier Aug 05 '13 at 08:48
  • The behavior observed here, and described in that link, is in contradiction with the official documentation for files, which states that a plain `file.readlines()` would read until EOF. It looks like a bug of codecs, or possibly a missing warning in the documentation if there is any reason for keeping it working that way. – Armin Rigo Aug 05 '13 at 08:52
  • @ArminRigo I have exactly the same impression. +1 – Olivier Verdier Aug 05 '13 at 09:08
  • For what it's worth, it's a known bug: http://bugs.python.org/issue8260 – Wooble Aug 05 '13 at 12:15
  • @Wooble brilliant! Basically, that's the answer I was looking for. Thanks! – Olivier Verdier Aug 05 '13 at 13:18
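
For anyone hitting the same truncation, one possible workaround is to open the file with `io.open` instead of `codecs.open`. This is only a sketch: `io.open` is not mentioned anywhere in the thread, and the helper name `read_after_first_line` and the temporary file standing in for `foo` are made up for illustration. It mirrors the readline-then-readlines pattern from the question.

```python
import io
import os
import tempfile

# Stand-in for the file "foo" from the question (assumed content).
CONTENT = (
    u"0\n"
    u"aaaa\n"
    u"bbbb\n"
    u"cddd\n"
    u"dddddddd\n"
    u"\n"
    u"Long sentence here which is not even read completely\n"
    u"\n"
    u"The rest is ignored...\n"
)


def read_after_first_line(path):
    """Skip the first line, then return the rest of the file as one string."""
    # io.open returns a text-mode file object that decodes UTF-8 itself,
    # so codecs.open (and its StreamReader) is not involved at all.
    with io.open(path, encoding="utf8") as log:
        log.readline()                    # discard the first line ("0")
        return u"".join(log.readlines())  # everything up to EOF


# Write the sample content to a temporary file, read it back, then clean up.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with io.open(path, "w", encoding="utf8", newline="\n") as f:
        f.write(CONTENT)
    rest = read_after_first_line(path)
finally:
    os.remove(path)

print(rest)
```

Iterating over the file object (`for line in log:`) after the initial `readline()` should work just as well with `io.open`, if you would rather not build the whole list in memory.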