I'm quite new to Python and not an experienced coder, so apologies in advance if this is a basic question. I'm trying to merge several Google Trends report.csv files to use in my research.

Two problems I encounter:

  1. The report files aren't plain spreadsheets; they contain a lot of other information that is irrelevant to me. I only want a certain range of each file to be merged: the daily data, i.e. the dates and the corresponding SVI for each month (say, rows 6 to 30).

  2. The daily data will be extracted from monthly report files, and months don't all have the same number of days. So I can't read a fixed range of rows; the range has to depend on the number of days in the specific month.

Many thanks for the help!

Edit:

The code I use:

import pandas as pd
report = pd.read_csv('C:/Users/paul/Downloads/report.csv', skiprows=4, skipfooter=17)
print(report)

The output it produces

I managed to cut off the first few lines, but I don't know how to cut off everything from row 31 onwards; skipfooter didn't seem to work. I can't use nrows either, because the months don't all have the same number of days, so I don't know the number of rows in advance.

Paul

1 Answer

It turns out that it helps to read the warnings Python gives once in a while:

ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support skip_footer; you can avoid this warning by specifying engine='python'.

The problem I had, that the skipfooter option (spelled skip_footer in older pandas versions) didn't work, was apparently related to the default C engine, which doesn't support it.

For anyone running into the same issue, here's the code I solved it with:

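import pandas as pd

# skipfooter is only supported by the Python parsing engine, so request it explicitly.
# (Older pandas versions spelled this option skip_footer; current versions only accept skipfooter.)
report = pd.read_csv('C:/Users/paul/Downloads/report.csv', skiprows=4, skipfooter=27, engine='python')
print(report)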

Just add engine='python' to get rid of the C engine problem. Don't ask me why I had to skip 27 rows in the end (I was pretty sure I counted 17), but a bit of trial and error got it working.
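Since counting header and footer rows by hand is clearly fragile, and the original goal was to merge several months, here is a rough sketch of a different approach: keep only the lines that look like daily rows, however many days the month has, and then concatenate the files. This is not what I actually ran. It assumes each daily row starts with an ISO date followed by a comma and the value (for example 2014-01-31,57). The names read_daily_block, merge_reports, make_sample_report and the column name "svi" are made up for illustration, and the sample data is fake.

```python
import io
import re

import pandas as pd

# Assumption: a daily row looks like "2014-01-31,57" (ISO date, comma, value).
# Lines from other sections of the report do not match and are ignored.
DAILY_ROW = re.compile(r"^(\d{4}-\d{2}-\d{2}),([^,]*)")


def read_daily_block(source, value_name="svi"):
    """Return the daily rows of one report as a DataFrame (date, value)."""
    # Accept either a file path or an already-open text file object.
    if hasattr(source, "read"):
        lines = source.read().splitlines()
    else:
        with open(source, encoding="utf-8") as handle:
            lines = handle.read().splitlines()

    rows = []
    for line in lines:
        match = DAILY_ROW.match(line)
        if match:
            rows.append((match.group(1), match.group(2)))

    frame = pd.DataFrame(rows, columns=["date", value_name])
    frame["date"] = pd.to_datetime(frame["date"])
    # Blank or non-numeric values become NaN instead of raising.
    frame[value_name] = pd.to_numeric(frame[value_name], errors="coerce")
    return frame


def merge_reports(sources):
    """Stack the daily blocks of several reports and sort them by date."""
    frames = [read_daily_block(source) for source in sources]
    merged = pd.concat(frames, ignore_index=True)
    return merged.sort_values("date").reset_index(drop=True)


def make_sample_report(year, month, days):
    """Build a fake in-memory report (placeholder data, not real Trends output)."""
    head = ["Web Search interest: test", "Worldwide", "", "Interest over time", "Day,test"]
    body = [f"{year}-{month:02d}-{day:02d},{day}" for day in range(1, days + 1)]
    tail = ["", "Top regions for test", "Region,test", "Germany,100"]
    return io.StringIO("\n".join(head + body + tail))


# Two months of different lengths, passed in the "wrong" order on purpose.
merged = merge_reports([make_sample_report(2014, 2, 28), make_sample_report(2014, 1, 31)])
print(len(merged))
```

With real files you would pass a list of paths to merge_reports instead of the in-memory samples. If your reports use a different date format, the regular expression is the only part that should need changing.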

Paul