0

I am trying to create a dictionary whose keys are the column names in the first row of the CSV file and whose values are dictionaries of {first-column value: corresponding value in that column}:

import pandas as pd

df = pd.read_csv('~/StockMachine/data_stocks.csv', index_col=['DATE'], sep=',\s+')

data = df.to_dict()

print(data)

However, I get this error "ValueError: Index DATE invalid".

Traceback:

  File "/Users/cs/StockMachine/stockmachine.py", line 4, in <module>
    df = pd.read_csv('~/StockMachine/data_stocks.csv', index_col=['DATE'], sep=',\s+')
  File "/Users/cs/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 678, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Users/cs/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 446, in _read
    data = parser.read(nrows)
  File "/Users/cs/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 1036, in read
    ret = self._engine.read(nrows)
  File "/Users/cs/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 2273, in read
    index, columns = self._make_index(data, alldata, columns, indexnamerow)
  File "/Users/cs/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 1425, in _make_index
    index = self._get_simple_index(alldata, columns)
  File "/Users/cs/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 1457, in _get_simple_index
    i = ix(idx)
  File "/Users/cs/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 1452, in ix
    raise ValueError('Index %s invalid' % col)

data_stocks.csv: CSV File

  • Are you sure `sep=',\s+'` is correct? If the CSV is comma-separated, use `sep=','` so that read_csv() can parse the file properly. – Theo Jul 30 '18 at 07:43
  • Could you provide a subset of your csv that reproduces the error? – Luca Cappelletti Sep 29 '19 at 07:06
  • `sep=',\s+'` is wrong. The separator(/delimiter) is either comma or whitespace - not both. If you specify comma, pandas will correctly ignore whitespace. – smci Feb 11 '20 at 00:39
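To illustrate the point made in the comments, here is a minimal sketch of how the separator changes what pandas sees as column names. The sample data is made up (the real data_stocks.csv is not shown), and it assumes the file has no whitespace after its commas. `skipinitialspace=True` is the read_csv() option that drops spaces following a delimiter.

```python
import io
import pandas as pd

# Hypothetical sample; the real data_stocks.csv is not shown in the question.
SAMPLE = "DATE,SP500,AAPL\n1491226200,2363.5,143.75\n1491226260,2364.25,143.5\n"

# A regex separator that requires whitespace after the comma never matches here,
# so every line is parsed as one single field.
bad = pd.read_csv(io.StringIO(SAMPLE), sep=r",\s+", engine="python")

# A plain comma separator; skipinitialspace drops spaces that follow a delimiter.
good = pd.read_csv(io.StringIO(SAMPLE), sep=",", skipinitialspace=True, index_col="DATE")

bad_columns = list(bad.columns)
good_columns = list(good.columns)
print(bad_columns)   # one column holding the whole header line
print(good_columns)  # the remaining columns once DATE is the index
```

If the first call yields a single column named after the whole header line, there is no column called DATE for index_col to find, which would be consistent with the error in the question.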

2 Answers

2

A similar thing happened to me. In my case some values in the ['DATE'] column were strings containing stray spaces. Try something like this:

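import pandas as pd

# Read without index_col first; a plain comma separator with skipinitialspace
# drops any spaces that follow a comma
df = pd.read_csv('~/StockMachine/data_stocks.csv', sep=',', skipinitialspace=True)

# Strip any remaining whitespace from the DATE values
df['DATE'] = df['DATE'].astype(str).str.strip()

# Only now use DATE as the index
df.set_index('DATE', inplace=True)

print(df.head())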
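
For completeness, here is a rough sketch of the whole path from file to the nested dictionary the question asks for. It uses an in-memory sample with spaces after the commas (an assumption, since the real file is not shown), and it also strips the header names in case whitespace survives there.

```python
import io
import pandas as pd

# Hypothetical sample with spaces after the commas; the real file is not shown.
SAMPLE = "DATE, SP500, AAPL\n1491226200, 2363.5, 143.75\n1491226260, 2364.25, 143.5\n"

df = pd.read_csv(io.StringIO(SAMPLE), sep=",", skipinitialspace=True)

# Normalise header names and DATE values in case stray whitespace survived parsing.
df.columns = df.columns.str.strip()
df["DATE"] = df["DATE"].astype(str).str.strip()
df = df.set_index("DATE")

# The default orient="dict" yields {column name: {index value: cell value}}.
data = df.to_dict()
print(data)
```

With DATE as the index, `to_dict()` gives one outer key per remaining column and, inside each, one entry per DATE value.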
0

I had the same issue, then realized that the column I had selected for index_col was not part of the row selected by header=1 in my script.
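
A quick way to check for this situation is to read only the header row and look at the names pandas actually found. This is just a sketch with made-up data, assuming the real header is the first line of the file, so that header=1 picks a data row instead.

```python
import io
import pandas as pd

# Hypothetical sample whose real header is the first line.
SAMPLE = "DATE,SP500,AAPL\n1491226200,2363.5,143.75\n1491226260,2364.25,143.5\n"

# nrows=0 parses only the header row, which is enough to inspect column names.
names_header0 = list(pd.read_csv(io.StringIO(SAMPLE), header=0, nrows=0).columns)
names_header1 = list(pd.read_csv(io.StringIO(SAMPLE), header=1, nrows=0).columns)

print("DATE" in names_header0)  # the first line is used as the header
print("DATE" in names_header1)  # the second line is used as the header
```

If the name passed to index_col is missing from the list printed for your header setting, adjust header (or the separator) before setting the index.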
