I got the following error:

TypeError: parse() takes 1 positional argument but 2 were given

I was doing some basic data preparation and wanted to parse the date-time information as a pandas DataFrame index (combining the 'date' and 'time' columns into a single column). This is a snippet of the code:

from pandas import read_csv
from datetime import datetime
def parse(x):
    return datetime.strptime(x,'%d-%b-%y %H:%M:%S' )

dataset = read_csv("dataset.csv", header=0, parse_dates = [['date', 'time']],
                   index_col=0, date_parser= parse)

This is what the original date and time look like:

date          time
25-Apr-17   19:19:40
25-Apr-17   19:19:40
25-Apr-17   19:19:45
25-Apr-17   19:19:45

I also came across an alternative:

dataset = read_csv("dataset.csv", header=0, parse_dates = {'datetime':[1,2]},
                   index_col=0, date_parser=lambda x: datetime.strptime(x,'%d-%b-%y %H:%M:%S' ))

But I still get the same issue: TypeError: <lambda>() takes 1 positional argument but 2 were given

I was wondering if you guys could help me with this issue?

Amhs_11
  • Does this answer your question? [Can pandas automatically recognize dates?](https://stackoverflow.com/questions/17465045/can-pandas-automatically-recognize-dates) – FObersteiner May 27 '20 at 06:08
  • By the way, I think you don't need to specify the date_parser at all; pandas read_csv parses it correctly on its own. If you have to anyway, use `date_parser=lambda x, y: datetime.strptime(x+' '+y,'%d-%b-%y %H:%M:%S')`. You'll need two variables since you pass two columns to the function. – FObersteiner May 27 '20 at 06:14
  • Thanks @MrFuppes, I had a look at it and used some of the ideas. I still got an error like `ValueError: unconverted data remains: `. I used `date_parser=lambda x, y: datetime.strptime(x+' '+y,'%d-%b-%y %H:%M:%S')`, and I got this error: `ValueError: time data ' 19:19:40 9' does not match format '%d-%b-%y %H:%M:%S'` – Amhs_11 May 27 '20 at 06:25
  • Did you check the "time" column in the .csv? Does it contain an irregular format such as ' 19:19:40 9' in places? Sounds like a little cleanup is required before parsing. – FObersteiner May 27 '20 at 06:39
  • I double-checked; it looks like the '9' comes from the next column in the dataset, the one next to the time column. I used `delimiter=","` but it doesn't solve the issue. I still get the same error. – Amhs_11 May 27 '20 at 09:50
  • You mean in the .csv it is correctly separated by `,` and `pandas.read_csv` parses it incorrectly? – FObersteiner May 27 '20 at 10:08
  • Yes, in the .csv. For this error, `ValueError: time data ' 19:19:40 9' does not match format '%d-%b-%y %H:%M:%S'`, I think the value '9' comes from the column beside the 'time' column. The columns are 'date','time','col1','col2','col3', and the 9 comes from 'col1'. – Amhs_11 May 27 '20 at 10:29
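
One possible reading of the `' 19:19:40 9'` error discussed in the comments, offered as an assumption rather than a confirmed diagnosis: if the positional variant `parse_dates={'datetime':[1,2]}` was still in use, zero-based positions 1 and 2 would select 'time' and 'col1' instead of 'date' and 'time'. The sketch below reproduces that message with plain Python, without pandas. The `row` values and the `join_fields` helper are hypothetical, and the leading space on the time field is assumed from the error text.

```python
from datetime import datetime

FMT = "%d-%b-%y %H:%M:%S"

# Hypothetical row with the layout 'date', 'time', 'col1'.
# The leading space on the time field is an assumption taken from the error text.
row = ["25-Apr-17", " 19:19:40", "9"]


def join_fields(fields, positions):
    """Join the fields at the given zero-based positions with a single space."""
    return " ".join(fields[i] for i in positions)


# Positions 1 and 2 pick 'time' and 'col1', not 'date' and 'time'.
wrong = join_fields(row, [1, 2])
try:
    datetime.strptime(wrong, FMT)
except ValueError as exc:
    wrong_error = str(exc)
    print(wrong_error)

# Positions 0 and 1, with surrounding whitespace stripped, parse cleanly.
right = join_fields([field.strip() for field in row], [0, 1])
parsed = datetime.strptime(right, FMT)
print(parsed)
```

If this is what happened, selecting the columns by name, as in `parse_dates=[['date', 'time']]`, sidesteps the off-by-one.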

2 Answers

You get this error because parse_dates combines two columns, so the date parser is called with two values, while your function accepts only one.
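
The mismatch can be reproduced without pandas or a CSV file. This is a minimal sketch that assumes the parser is called with one positional argument per selected column, as described above; `parse_one` and `parse_two` are illustrative names, not part of the question's code.

```python
from datetime import datetime

FMT = "%d-%b-%y %H:%M:%S"


def parse_one(x):
    # Accepts a single string, like the parser in the question.
    return datetime.strptime(x, FMT)


def parse_two(date_str, time_str):
    # Accepts one argument per column and joins them before parsing.
    return datetime.strptime(f"{date_str} {time_str}", FMT)


# Calling the one-argument parser with two values raises the TypeError.
try:
    parse_one("25-Apr-17", "19:19:40")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The two-argument parser handles the same call.
parsed = parse_two("25-Apr-17", "19:19:40")
print(parsed)
```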

If you want to parse the timestamp manually, define the parser with two parameters, as in the following example:

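from datetime import datetime
from pandas import read_csv

# Two parameters: one per column listed in parse_dates
def parse(x, y):
    return datetime.strptime(f"{x} {y}", "%d-%b-%y %H:%M:%S")

dataset = read_csv("dataset.csv", header=0, parse_dates = [["date", "time"]],
                   index_col=0, date_parser=parse)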

dataset

                     Unnamed: 0
date_time                      
2017-04-25 19:19:40           0
2017-04-25 19:19:40           1
2017-04-25 19:19:40           2
Panagiotis Simakis

After some trial and error, I eventually managed to resolve the issue. I used pd.to_datetime instead of datetime.strptime.

import pandas as pd
from pandas import read_csv

def parse(d, t):
    # Join the date and time strings, then let pandas infer the format
    dt = d + " " + t
    return pd.to_datetime(dt)

dataset = read_csv("dataset.csv", header=0, parse_dates={'datetime': ['date', 'time']},
                   index_col=0, date_parser=parse)

The output:

datetime
2017-04-25 19:19:40
2017-04-25 19:19:40
2017-04-25 19:19:45
2017-04-25 19:19:45

I double-checked the dtype of the 'date' and 'time' columns, and both were 'object'. I am not sure whether this method would work with other dtypes, but it solves my problem.
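
For comparison, here is a sketch of the same idea done after loading, which avoids a custom parser altogether. As far as I know, recent pandas releases (2.0 and later) deprecate `date_parser`, so this variant may be more future-proof. The inline CSV stands in for dataset.csv, and the `col1` column and its values are made up for illustration.

```python
import io

import pandas as pd

# Inline stand-in for dataset.csv; the col1 values are made up.
csv_text = """date,time,col1
25-Apr-17,19:19:40,9
25-Apr-17,19:19:40,7
25-Apr-17,19:19:45,8
"""

df = pd.read_csv(io.StringIO(csv_text))

# Text columns hold strings, so + concatenates them element-wise.
combined = df["date"] + " " + df["time"]

# An explicit format avoids relying on format inference.
df.index = pd.to_datetime(combined, format="%d-%b-%y %H:%M:%S")
df.index.name = "datetime"
df = df.drop(columns=["date", "time"])
print(df)
```

Columns that read_csv loads from text hold ordinary Python strings, which is why the concatenation works here.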

Thank you everyone for your participation.

Amhs_11