I am learning the I/O functions in numpy, in particular genfromtxt. I tried an example from the numpy user guide that demonstrates the comments argument of genfromtxt.

Here is the example from the user guide of numpy:

>>> data = """#
... # Skip me !
... # Skip me too !
... 1, 2
... 3, 4
... 5, 6 #This is the third line of the data
... 7, 8
... # And here comes the last line
... 9, 0
... """
>>> np.genfromtxt(StringIO(data), comments="#", delimiter=",")
[[ 1. 2.]
[ 3. 4.]
[ 5. 6.]
[ 7. 8.]
[ 9. 0.]]

I tried below:

data = """#                 \
    # Skip me !         \
    # Skip me too !     \
    1, 2                \
    3, 4                \
    5, 6 #This is the third line of the data    \
    7, 8                \
    # And here comes the last line  \
    9, 0                \
    """
a = np.genfromtxt(io.BytesIO(data.encode()), comments = "#", delimiter = ",")
print (a)

Result comes out:

genfromtxt: Empty input file: "<_io.BytesIO object at 0x0000020555DC5EB8>" warnings.warn('genfromtxt: Empty input file: "%s"' % fname)

I know the problem is with data. Can anyone show me how to define data as in the example? Thanks a lot.

Merlin
TCNO_C

2 Answers

Try the code below. First, don't use "\" at the end of each line. Second, why are you using BytesIO()? Use StringIO() instead.

import numpy as np
from StringIO import StringIO  # Python 2 only; in Python 3 the class is io.StringIO

data = """#                 
    # Skip me !     
    # Skip me too !     
    1, 2                
    3, 4                
    5, 6 #This is the third line of the data    
    7, 8                
    # And here comes the last line  
    9, 0                
    """

np.genfromtxt(StringIO(data), comments="#", delimiter=",")

array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.],
       [ 7.,  8.],
       [ 9.,  0.]])
Merlin

In an IPython (Python 3) interactive session I can do:

In [326]: data = b"""#
     ...: ... # Skip me !
     ...: ... # Skip me too !
     ...: ... 1, 2
     ...: ... 3, 4
     ...: ... 5, 6 #This is the third line of the data
     ...: ... 7, 8
     ...: ... # And here comes the last line
     ...: ... 9, 0
     ...: ... """
In [327]: 
In [327]: data
Out[327]: b'#\n# Skip me !\n# Skip me too !\n1, 2\n3, 4\n5, 6 #This is the third line of the data\n7, 8\n# And here comes the last line\n9, 0\n'
In [328]: np.genfromtxt(data.splitlines(),comments='#', delimiter=',')
Out[328]: 
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.],
       [ 7.,  8.],
       [ 9.,  0.]])

In Python 3, the string needs to be bytes (hence the b prefix); in Python 2 byte strings are the default.
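
To make the str/bytes distinction concrete, here is a minimal sketch that uses only the standard library io module. The variable names are illustrative and not taken from the question.

```python
import io

text = "1, 2\n3, 4\n"   # str (unicode) is the default string type in Python 3
raw = text.encode()     # bytes, encoded as UTF-8 by default

# BytesIO wraps bytes and StringIO wraps str.
byte_stream = io.BytesIO(raw)
text_stream = io.StringIO(text)

# Passing a str where bytes are required raises TypeError.
try:
    io.BytesIO(text)
    mixed_ok = True
except TypeError:
    mixed_ok = False

first_bytes_line = byte_stream.readline()
first_text_line = text_stream.readline()
print(type(raw).__name__, first_bytes_line, repr(first_text_line), mixed_ok)
```

This is why the question's code calls data.encode() before handing the string to io.BytesIO.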

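In a multiline (triple-quoted) string, don't end the lines with \. A trailing backslash is a line continuation: it removes the newline, so all the lines are joined into one long line that starts with #, and genfromtxt treats the whole thing as a comment. That is why it reports an empty input file. You want to keep the \n at the end of each line: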

data = b"""
one
two
"""

Notice I could have also used:

data = '#\n# Skip me\n...'

with explicit \n.
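
The effect of a trailing backslash can be checked without numpy. The following is a minimal sketch; the two strings are shortened, hypothetical versions of the data in the question.

```python
# With a trailing backslash the newline is escaped, so the lines are joined.
joined = """# Skip me \
1, 2 \
3, 4
"""

# Without the backslash, each line keeps its own newline character.
separate = """# Skip me
1, 2
3, 4
"""

print(len(joined.splitlines()))    # number of lines in the joined version
print(separate.splitlines())       # the individual lines
```

The joined version is a single line that begins with #, so a reader that skips comments has nothing left to parse.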

genfromtxt works with any iterable that gives it lines, so I gave it a list of lines produced with splitlines. StringIO (or BytesIO in Py3) also works, but it is extra work.
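
As a rough illustration of the two input styles, the sketch below feeds the same bytes to genfromtxt once as a list of lines and once through io.BytesIO. The sample data is a shortened, made-up variant of the user guide example, and numpy is assumed to be installed.

```python
import io
import numpy as np

data = b"""# Skip me !
1, 2
3, 4 # trailing comment
# another comment
5, 6
"""

# A list of byte-string lines is accepted directly.
from_lines = np.genfromtxt(data.splitlines(), comments="#", delimiter=",")

# A file-like object wrapping the same bytes is accepted as well.
from_stream = np.genfromtxt(io.BytesIO(data), comments="#", delimiter=",")

print(from_lines)
print(np.array_equal(from_lines, from_stream))
```

Both calls should give the same 3x2 float array; the list version just skips the step of building a file-like object.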

Of course, another option is to copy those lines into a text editor and save them as a simple text file. Copy-and-paste into the interactive session is a handy shortcut, but not necessary.

In [329]: data.splitlines()
Out[329]: 
[b'#',
 b'# Skip me !',
 b'# Skip me too !',
 b'1, 2',
 b'3, 4',
 b'5, 6 #This is the third line of the data',
 b'7, 8',
 b'# And here comes the last line',
 b'9, 0']
hpaulj
  • @hpaulj does StringIO behave differently in py3 vs py2? I'm not a user of py3. – Merlin Jul 26 '16 at 05:02
  • Thank you for the multiline string input method. I have problems with triple quotes and "\" – TCNO_C Jul 26 '16 at 05:08
  • `genfromtxt` expects its input to be byte strings (no unicode). Py3 uses unicode strings as the default format, and `StringIO` handles the default (regardless of version). It's the 2nd most common difference between 2 and 3 (most common is the `print(...)` expression). – hpaulj Jul 26 '16 at 07:14