
I use:

Data = np.genfromtxt(filename, delimiter='"\t"', dtype=None, autostrip=True, skip_header=1)

While running, it raises ValueError(errmsg):

Line #33 (got 3 columns instead of 27)

But that is not true: that line of the file has all 27 columns. I checked, and the function reads each of the lines reported as having "missing values" only up to a certain character. For example, in line 33 the function reads this:

"http://www.savvyeat.com/whole-wheat-chocolate-chai-muffins/"   "2152"  "{""title"":""Whole Wheat Chocolate Chai Muffins Savvy Eats "",""body"":""I think I subconsciously sabotaged myself Two weeks ago I couldn t

How can I read my file into a NumPy array some other way, or otherwise fix this problem?

Saullo G. P. Castro
Il'ya Zhenin

1 Answer


This kind of problem should be quickly solvable once we can see the contents of the CSV file. To debug the problem run:

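import itertools as IT

# Read in binary mode so the raw bytes (tabs, quotes, '#') show up in the repr.
with open(filename, 'rb') as f:
    # Join as bytes: lines read in binary mode are bytes objects on Python 3.
    content = b''.join(IT.islice(f, 50))
    print(repr(content))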

and post the output. That will give us the first 50 lines of the file. If there is sensitive data, you can redact it before posting; just leave the quotation marks and \t intact.

unutbu
  • A piece of one sentence: "the uk's #1 news portal". With your code it reads fine, but genfromtxt stops right before "#1". I deleted the # sign, leaving "the uk's 1 news portal", and now genfromtxt reads it all. – Il'ya Zhenin Aug 31 '13 at 14:33
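The comment above points at the likely cause: by default np.genfromtxt treats '#' as the start of a comment (comments='#') and discards the rest of the line, which would explain the short column count. Rather than deleting the '#' characters from the file, it may be enough to pass comments=None. The following is a minimal sketch on made-up sample data, not the asker's file; the sample text, the read_table helper, and the plain '\t' delimiter are assumptions for illustration.

```python
import io
import numpy as np

# Hypothetical sample: a header plus two tab-separated rows.
# The second data row contains '#', which genfromtxt treats as a
# comment marker unless told otherwise.
sample = (
    'url\ttitle\tid\n'
    '"a.com"\t"plain title"\t"1"\n'
    '"b.com"\t"the uk\'s #1 news portal"\t"2"\n'
)

def read_table(text, comments):
    # Illustrative helper: parse the text with the given comment marker.
    return np.genfromtxt(io.StringIO(text), delimiter='\t', dtype=str,
                         skip_header=1, autostrip=True,
                         comments=comments, encoding='utf-8')

# With the default marker, the second row is cut off at '#', so the
# column count no longer matches and genfromtxt raises ValueError.
try:
    read_table(sample, comments='#')
    default_failed = False
except ValueError:
    default_failed = True

# With comments=None, '#' is kept as ordinary data.
data = read_table(sample, comments=None)
print(default_failed)
print(data.shape)
```

Note that genfromtxt does not interpret quotation marks, so the quotes remain part of each value, and a tab inside a quoted field would still split the field. If the file contains such cases, Python's csv module with delimiter='\t' may be a better fit for parsing before converting the rows to an array.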