Load only up to a number of lines when loading a csv file in numpy

Question

When working with large datasets, I often want to test my code with very few samples. This allows me to spot bugs before investing a long time in calculations.

Reading the data is often one of the time-consuming steps, so it's nice that pandas lets me specify `nrows` to read the file up to a certain line and then stop. For these test runs I don't care about the accuracy of the results, only about catching bugs in the code.

I can't seem to find similar functionality when using numpy directly, either with `genfromtxt` or `loadtxt`. Am I overlooking something? I'll go ahead and look into it myself if it's not available, but I thought I'd check with you guys first. Thanks!

You could just still use `nrows` using pandas and then access the underlying numpy array by calling [`.values`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.values.html#pandas.DataFrame.values) attribute of the dataframe — EdChum, Oct 26 '14 at 19:20
Does [this answer](http://stackoverflow.com/a/13663832/3923281) solve the issue for you? — Alex Riley, Oct 26 '14 at 19:20
@EdChum Thanks. That'd work but Pandas will still try to figure out column types and the like when loading. True, if using only a few rows it won't make a difference. Hm. That'd work. Thanks! — Miquel, Oct 26 '14 at 19:21
@ajcr Yes it does, `itertools.islice` handles this nicely, if not pretty-ly. Also, this question is a duplicate. I didn't find the one you quoted, thanks! — Miquel, Oct 26 '14 at 19:23
@ajcr will you be posting this as answer? If you don't I will do it myself so as to declare the question closed. And thanks! — Miquel, Oct 26 '14 at 19:43
@Miquel: No problem! I hadn't written anything further, so happy for you to post the answer that you feel fits your question best. — Alex Riley, Oct 26 '14 at 19:52
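
For reference, the `itertools.islice` approach mentioned in the comments could look roughly like the sketch below. This is only a minimal illustration, not necessarily the code from the linked answer: the helper name `load_first_rows` and the sample data are made up for the example. It relies on `np.loadtxt` accepting any iterable of lines, so the file is consumed only up to the requested number of lines.

```python
import io
from itertools import islice

import numpy as np


def load_first_rows(source, n_rows, **kwargs):
    """Hypothetical helper: parse only the first n_rows lines of `source`.

    `source` is any iterable of text lines, e.g. an open file object.
    Extra keyword arguments are passed through to np.loadtxt.
    """
    # islice stops pulling lines after n_rows, so the rest is never read.
    return np.loadtxt(islice(source, n_rows), **kwargs)


# Made-up sample data standing in for a large CSV file.
csv_text = "1,2,3\n4,5,6\n7,8,9\n10,11,12\n"

head = load_first_rows(io.StringIO(csv_text), 2, delimiter=",")

# Newer NumPy releases also offer a max_rows argument
# (genfromtxt since 1.10, loadtxt since 1.16), which does the same job.
head_max_rows = np.genfromtxt(io.StringIO(csv_text), delimiter=",", max_rows=2)
```

With a real file you would pass the open file object instead of the `StringIO`, e.g. `with open(path) as f: load_first_rows(f, 100, delimiter=",")`. Note that `islice` counts raw lines, so a header or comment line counts towards `n_rows` even if numpy later skips it.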
