I am trying to read a .csv file that has 1,048,575 rows and 3 columns; some index values might be missing.

The .csv file looks like this:

[Screenshot: CSV file]

I wrote this line of code to read the .csv file using Pandas:

df = pd.read_csv('GHI.csv', thousands=',', sep=';', usecols=['Time', 'Value'])

However, the dataframe is read incorrectly: its length is off and its tail shows incorrect timestamps.

[Screenshot: DataFrame]

Can someone please let me know how to parse this file properly and make the timestamps timezone-aware?

Thanks, Debayan

  • parse the dates using the [parse_dates](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#date-handling) argument in read_csv. – sammywemmy Apr 15 '20 at 11:06
  • Do you mean adding this to the argument list: `parse_dates=['Time'], infer_datetime_format=True`? – Debayan Paul Apr 15 '20 at 11:19
  • Anyway, I tried with the arguments, still, it is the same. – Debayan Paul Apr 15 '20 at 11:26
  • I shared a link to the docs, that extensively explains how to go about parsing dates. kindly have a look at it. if infer fails, you can write a custom function, or pass a format for how pandas should read in the dates. but the docs i bliv will guide u much better – sammywemmy Apr 15 '20 at 11:27
  • Hi, I tried several combinations but none of them worked. Can you please look into the two screenshots I attached and tell me where I might be going wrong? Also, the line is taking almost 5-6 minutes to execute every time. – Debayan Paul Apr 15 '20 at 11:51
  • can u share what u tried(data, not pics), let's see what the error is. include ur code as well. – sammywemmy Apr 15 '20 at 11:56
  • I tried this: `df = pd.read_csv('GHI.csv', thousands=',', sep=';', usecols=['Time','Value'], parse_dates=['Time'], quotechar='"')` and also `df = pd.read_csv('GHI.csv', thousands=',', sep=';', usecols=['Time','Value'], parse_dates=['Time'], quotechar='"', date_parser=pd.io.date_converters.parse_date_time)` – Debayan Paul Apr 15 '20 at 12:02
  • not sure what else i can suggest apart from what is in the docs. hopefully someone else can suggest sth to solve the challenge – sammywemmy Apr 15 '20 at 12:31
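
The suggestion in the comments (pass an explicit format instead of relying on inference) can be sketched roughly as follows. This is only a sketch under assumptions: the real file is visible only in the screenshots, so the sample data, the format string `%d.%m.%Y %H:%M`, the helper name `read_ghi`, and the `UTC` zone are all hypothetical placeholders to be replaced with whatever matches GHI.csv. Only the column names `Time` and `Value`, the `;` separator, and the `,` thousands separator come from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for GHI.csv: three columns, ';' separated,
# with a gap in the index column (index 2 is missing).
SAMPLE = """Index;Time;Value
0;01.01.2019 00:00;1,234
1;01.01.2019 00:01;1,250
3;01.01.2019 00:03;980
"""


def read_ghi(source, fmt="%d.%m.%Y %H:%M", tz="UTC"):
    """Read the file, then parse and localize the Time column.

    `fmt` and `tz` are assumptions; adjust them to the actual data.
    """
    df = pd.read_csv(
        source,
        sep=";",
        thousands=",",
        usecols=["Time", "Value"],
    )
    # An explicit format avoids per-row format guessing; rows that do not
    # match become NaT instead of being silently misparsed.
    df["Time"] = pd.to_datetime(df["Time"], format=fmt, errors="coerce")
    # Attach a timezone to the naive timestamps.
    df["Time"] = df["Time"].dt.tz_localize(tz)
    return df


df = read_ghi(io.StringIO(SAMPLE))
print(df.dtypes)
print(df.tail())
```

With the real file, `read_ghi('GHI.csv', fmt=..., tz=...)` would be the call. Counting `df['Time'].isna().sum()` afterwards shows how many rows did not match the assumed format, which may help locate the rows that come out wrong. For a zone with daylight saving time, `tz_localize` also accepts `ambiguous` and `nonexistent` arguments for the clock-change hours.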
