2

Scenario: Following up from a previous question (Removing the timestamp from a datetime in pandas dataframe) I have a code that reads data from excel into a pandas dataframe and uses the command dataframe.to_sql to insert that data into a given SQL database.

Problem: Since my dates are retrieved in the american format (mm/dd/yyyy) and my database is in the international format (dd/mm/yyyy), I get a type error trying to upload my data to SQL.

What I already tried: I tried the suggestion given in the previous question and in that way the data changing runs, but I get the error when I try to upload to SQL.

Line that I am currently using:

fnl['Date'] = pd.to_datetime(fnl['Maturity'], errors='coerce')

which allows the code to run, but yields the problem in SQL.

I also tried:

fnl['Date'] = pd.to_datetime(fnl['Date'], format='%m/%d/%Y', errors:'coerce')

and

fnl['Date'] = pd.to_datetime(fnl['Date'], format='%m/%d/%Y')

Still with no success.

I previously changed all my encoding to utf-8 in the beginning of the code, so I can't see where the problem lies.

Question: How can I solve this problem?

Data example:

Date
1/15/2023
1/15/2023
6/30/2023
6/30/2023
8/1/2022
8/1/2022
7/25/2022
7/25/2022
7/19/2024
7/13/2022
7/13/2022
cs95
  • 379,657
  • 97
  • 704
  • 746
DGMS89
  • 1,507
  • 6
  • 29
  • 60
  • I associate UTF-8 with MySQL, so I removed the SQL Server tag. You should tag with the database you are really using. – Gordon Linoff Sep 14 '17 at 11:51
  • You have incorrect syntax in the second line. Fix that and see what happens. Also, please include your error messages here so we understand what your problem is. – cs95 Sep 14 '17 at 11:52
  • If we can see what your column looks like, that would help. I think your format is wrong. – cs95 Sep 14 '17 at 11:56
  • @GordonLinoff You are right, I did not know the difference, thanks for the edit. – DGMS89 Sep 14 '17 at 11:57
  • @cᴏʟᴅsᴘᴇᴇᴅ I am going to update the question with a part of the column data. And regarding the error, forgive my incapacity, but I cannot find it. Care to specify? – DGMS89 Sep 14 '17 at 11:58
  • @DGMS89 If my answer doesn't help, you'll need to include your string column and your traceback for further debugging. – cs95 Sep 14 '17 at 11:59

1 Answers1

2

If your Date column returns dates in international format, you'll want to reflect that when you're converting to datetime.

fnl['Date'] = pd.to_datetime(fnl['Date'], format='%m/%d/%Y', errors='coerce')

You'll need to specify the format in the source so that the conversion can be done properly.


If you want to convert your datetime back to a string in the %m/%d/%Y format, use the .dt.strftime function:

fnl['DateString'] = fnl['Date'].dt.strftime('%d/%m/%Y')

s = pd.to_datetime(df['Date'], format='%m/%d/%Y', errors='coerce')
s

0    2023-01-15
1    2023-01-15
2    2023-06-30
3    2023-06-30
4    2022-08-01
5    2022-08-01
6    2022-07-25
7    2022-07-25
8    2024-07-19
9    2022-07-13
10   2022-07-13
Name: Date, dtype: datetime64[ns]

s = s.dt.strftime('%d/%m/%Y')
s

0     15/01/2023
1     15/01/2023
2     30/06/2023
3     30/06/2023
4     01/08/2022
5     01/08/2022
6     25/07/2022
7     25/07/2022
8     19/07/2024
9     13/07/2022
10    13/07/2022
Name: Date, dtype: object
cs95
  • 379,657
  • 97
  • 704
  • 746
  • Thanks for the answer. I changed the error after coerce and specified the format as it comes from the source data. The code runs as before, but my data is then changed to "NaT" (all the values in the column). – DGMS89 Sep 14 '17 at 12:02
  • @DGMS89 Added demo. – cs95 Sep 14 '17 at 12:08
  • I see your example, and understand there is a difference when using those functions. What would be the reason for my data being changed to NaT? – DGMS89 Sep 14 '17 at 12:09
  • @DGMS89 If your data does not adhere to the format, or if it is simply not a parsable date string, it is "coerced" to NaT. – cs95 Sep 14 '17 at 12:10
  • Understood. I tried changing the format to match an error python yielded, and got the following result: "ValueError: time data u'--' doesn't match format specified". I also tried changing errors='ignore', that seemed to work, but when I use (fnl['Maturity'] = fnl['Maturity'].dt.date) to remove the time from the date, I get the following error: (AttributeError: Can only use .dt accessor with datetimelike values). Any ideas why that might be happening? – DGMS89 Sep 14 '17 at 12:16
  • @DGMS89 The type of `fnl['Maturity']` needs to be `datetime` or the like, otherwise you cannot use `.dt`. – cs95 Sep 14 '17 at 12:17
  • It is strange, if I remove the '%m%d%Y', and set the coerce, the data gets changed properly and the returns a datetime64[ns] object. But when I export that to sql I get the error. I just don't understand what is going on. – DGMS89 Sep 14 '17 at 12:19
  • @DGMS89 Mmh, sorry, I don't know SQL and I don't know what errors you're getting, so I couldn't tell you what the problem was. Check your encoding. – cs95 Sep 14 '17 at 12:21