genfromtxt can skip header and footer lines and specify which columns to use. But how can I control how many lines it reads?

Sometimes a text file contains several blocks with different shapes. For example,

from io import StringIO
from numpy import genfromtxt

a=StringIO('''
1,2,3
1,2,3
2,3
2,3
''')
genfromtxt(a,delimiter=',',skip_header=1)

This raises an error:

ValueError: Some errors were detected !
    Line #4 (got 2 columns instead of 3)
    Line #5 (got 2 columns instead of 3)

Of course, I can do it like this:

a=StringIO('''
1,2,3
1,2,3
2,3
2,3
''')
genfromtxt(a,delimiter=',',skip_header=1,skip_footer=2)

This is ugly, because I have to count the number of rows that follow the block.

What I would like is something like

genfromtxt(a,delimiter=',',skip_header=1,nrows=2)

which would be clearer.

Does anyone have a good idea for this? Or is there another function I should use?


Update (October 2015)

This has been solved in newer versions of NumPy.

genfromtxt now has a keyword named max_rows, which controls the maximum number of lines to read (see the NumPy documentation for genfromtxt).
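
As a minimal sketch of how max_rows could be used on the example above (assuming NumPy 1.10 or later; the variable names `text`, `first` and `second` are only illustrative), the two blocks can be read separately by combining max_rows with skip_header:

```python
from io import StringIO

import numpy as np

# Same data as in the question, without the leading blank line.
text = "1,2,3\n1,2,3\n2,3\n2,3\n"

# Read only the first two rows, i.e. the 3-column block.
first = np.genfromtxt(StringIO(text), delimiter=",", max_rows=2)

# Skip those two rows and read the remaining 2-column block.
second = np.genfromtxt(StringIO(text), delimiter=",", skip_header=2)

print(first.shape, second.shape)
```

Because reading stops once max_rows rows have been collected, the shorter lines of the second block are never parsed in the first call, so no column-count error is raised.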

Syrtis Major
  • `fromfile` offers a `count` keyword that controls the number of items to read. However, `fromfile` is less flexible than `genfromtxt` or `loadtxt` when reading text files. – Syrtis Major Sep 19 '14 at 05:41

1 Answer

You can pass invalid_raise=False to skip lines that do not have the expected number of columns. E.g.

b = np.genfromtxt(a, delimiter=',', invalid_raise=False)

This gives you a warning, but does not raise an exception.
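
A self-contained sketch of this approach on the data from the question (the use of `warnings.catch_warnings` to silence the message locally is an addition for illustration, not something the approach requires):

```python
import warnings
from io import StringIO

import numpy as np

a = StringIO("1,2,3\n1,2,3\n2,3\n2,3\n")

with warnings.catch_warnings():
    # Silence the warning genfromtxt emits about the skipped lines,
    # but only inside this block.
    warnings.simplefilter("ignore")
    b = np.genfromtxt(a, delimiter=",", invalid_raise=False)

# Only the rows with three columns are kept.
print(b.shape)
```

Note that this keeps every line matching the column count of the first data line, wherever it appears in the file, so it filters by shape rather than by position.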

abudis
  • If you want to suppress that warning, you can add: `import warnings` `warnings.simplefilter("ignore", UserWarning)` – Syrtis Major Sep 19 '14 at 15:05