I'm having trouble loading this dataset for my Business Intelligence class. I tried a different CSV file and that worked. I tried googling for solutions but couldn't figure it out. Any help would be greatly appreciated!

# load data

col_names = ['age', 'gender', 'coffee_bags_bought', 'spent_last_week', 'spent_last_month', 'income', 'online', 'new_product']
# load dataset
coffeeStore = pd.read_csv("/content/CoffeeStore.xlsx", header=None, names=col_names)
coffeeStore.head(2)

This is the error I'm running into:

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-35-e3969313ee59> in <module>()
      3 col_names = ['age', 'gender', 'coffee_bags_bought', 'spent_last_week', 'spent_last_month', 'income', 'online', 'new_product']
      4 # load dataset
----> 5 coffeeStore = pd.read_csv("/content/CoffeeStore.xlsx", header=None, names=col_names)
      6 coffeeStore.head(2)

9 frames
/usr/local/lib/python3.7/dist-packages/pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error()

UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 15-16: invalid continuation byte
  • You're calling `read_csv`, which expects a **c**omma-**s**eparated **v**alues plain text file, but you're feeding it an XLSX file, which is a binary file. So that's not going to end well. Use the Excel loading function instead. – Mike 'Pomax' Kamermans May 04 '22 at 22:33
  • You're using `read_csv` on an Excel file. Use `read_excel` instead. – Freddy Mcloughlan May 04 '22 at 22:33

2 Answers

You're using `read_csv` on an Excel file. Use `read_excel` instead:

coffeeStore = pd.read_excel("/content/CoffeeStore.xlsx", header=None, names=col_names)
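
If the same notebook has to load both CSV and Excel files, one option is to choose the reader from the file extension. This is only a minimal sketch: `pick_reader`, `load_table`, and the `READERS` table are hypothetical names, not part of pandas, and `read_excel` is assumed to have an Excel engine such as openpyxl available for `.xlsx` files.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the name of a pandas reader.
READERS = {".csv": "read_csv", ".xlsx": "read_excel", ".xls": "read_excel"}


def pick_reader(path):
    """Return the name of the pandas reader that matches the file extension."""
    suffix = Path(path).suffix.lower()
    try:
        return READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix!r}") from None


def load_table(path, **kwargs):
    """Load a table with the reader chosen by pick_reader (hypothetical helper)."""
    import pandas as pd  # imported here so pick_reader works without pandas

    reader = getattr(pd, pick_reader(path))
    return reader(path, **kwargs)
```

With this helper, the call from the question would become `coffeeStore = load_table("/content/CoffeeStore.xlsx", header=None, names=col_names)`, which ends up in `pd.read_excel`.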
Freddy Mcloughlan

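You can also set the `engine` parameter to 'python'. Note that this only swaps the parser: the file is still an Excel workbook rather than CSV text, so `read_excel` remains the reliable fix.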

coffeeStore = pd.read_csv("/content/CoffeeStore.xlsx", header=None, names=col_names,engine='python')

For a more detailed explanation of Unicode, UTF-8, etc., read this legendary blog post.
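
To check whether a file is really text before handing it to `read_csv`, it can help to look at its content rather than its name. The sketch below relies on the fact that `.xlsx` files are ZIP containers; `describe_file` is a hypothetical helper, and the 4096-byte sample size is an arbitrary assumption.

```python
import codecs
import zipfile


def describe_file(path, sample_size=4096):
    """Guess the file type from its content, not its extension (hypothetical helper)."""
    # .xlsx workbooks are ZIP archives, so this catches them first.
    if zipfile.is_zipfile(path):
        return "zip"
    with open(path, "rb") as f:
        head = f.read(sample_size)
    # An incremental decoder tolerates a multi-byte character cut off
    # at the end of the sample, but still rejects invalid bytes.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(head, final=False)
    except UnicodeDecodeError:
        return "binary"
    return "text"
```

A result of `"zip"` for a file ending in `.xlsx` suggests it is an Excel workbook and should go to `read_excel`, while `"text"` suggests `read_csv` has a chance of working.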

Sam Oz