
I have 2 rows of data that look like the following (rows 8 and 9):

 2018-01-03T10:14:32.000Z
 2018-01-03T09:40:35.000Z
 2018-01-03T10:17:13.000Z
 2018-01-03T10:00:39.000Z
 2018-01-03T10:16:53.000Z
 2018-01-03T09:54:24.000Z
 2018-01-03T10:18:37.000Z
 2018-01-03T10:19:54.000Z
 2018-01-03T09:52:40.000Z
 2018-01-03T10:14:49.000Z
 2018-01-03T10:16:35.000Z

Code:

df = pd.read_csv('Plaku_City_Service_Requests_in_2018.csv', 
                 usecols = [8,9],
                 names = ['add', 'fix'])

df['delta'] = df['add'] - df['fix']

I am getting errors.

There are 330,000 entries in this CSV file. How do I find the timedeltas between these 2 columns?

I have these two columns stored in the variables add and fix, but I can't figure out how to compare them.

Any help would be great, Thanks!

Prune
  • I'm not clear on what you're trying to do. You refer to both two rows and two columns of data. You've posted something that looks like a single column, that being a time stamp. What sort of value do you expect to see as the difference between a date and a time-of-day? – Prune Feb 25 '19 at 22:29
  • Your basic request seems to be a direct time difference of some sort. Time difference is covered quite well in on-line tutorials dealing with the `datetime` package. Where is the code that has you stuck? – Prune Feb 25 '19 at 22:30
  • import datetime as datetime import pandas as pd df=pd.read_csv('Plaku_City_Service_Requests_in_2018.csv', usecols = [8,9], names = ['add', 'fix']) df['delta'] = df['add'] - df['fix'] – Nick Esposito Feb 25 '19 at 22:52
  • Edit your clarifications into the question, please. As you can see, code does not appear nicely in comments. – Prune Feb 25 '19 at 22:54
  • I tried the code in the answer below and I got more errors – Nick Esposito Feb 25 '19 at 22:58

1 Answer


It would seem appropriate to read both columns from the CSV into one DataFrame, rather than two separate ones:

df = pd.read_csv('2018.csv', usecols=[8, 9], names=['add', 'fix'])

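Note that `read_csv` does not parse dates by default, so the columns normally arrive as strings. If they already have a datetime dtype (for example, because `parse_dates` was passed to `read_csv`), then finding the deltas is as simple as: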

df['delta'] = df['add'] - df['fix']
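
As a rough sketch of this case, assuming both columns hold ISO 8601 timestamps like those in the question, `parse_dates` asks `read_csv` to convert the columns while loading. The inline CSV text is made-up sample data standing in for the real file, and it is assumed to have no header row.

```python
import io
import pandas as pd

# Made-up stand-in for the real CSV: two timestamp columns, no header row.
csv_text = (
    "2018-01-03T10:14:32.000Z,2018-01-03T09:40:35.000Z\n"
    "2018-01-03T10:17:13.000Z,2018-01-03T10:00:39.000Z\n"
)

df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,                 # assumption: the sample has no header row
    names=["add", "fix"],
    parse_dates=["add", "fix"],  # convert both columns to datetimes on load
)

# Both columns are now datetimes, so subtraction yields timedeltas.
df["delta"] = df["add"] - df["fix"]
print(df["delta"])
```

Checking `df.dtypes` after loading is a quick way to confirm that the columns were actually converted rather than left as strings.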

If, however, they were read as strings, you will need to explicitly convert them to datetimes before the subtraction:

df['delta'] = pd.to_datetime(df['add']) - pd.to_datetime(df['fix'])
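
A minimal sketch of the explicit conversion, again on made-up inline data. It assumes the file has a header row (the header names here are hypothetical), in which case passing `names` without `header=0` would make the header text the first data row. `errors='coerce'` turns any unparseable value into `NaT` instead of raising. Whether either of these explains the errors in the question is a guess, since no traceback was posted.

```python
import io
import pandas as pd

# Made-up sample; "Date Added" / "Date Fixed" are hypothetical header names.
csv_text = (
    "Date Added,Date Fixed\n"
    "2018-01-03T10:14:32.000Z,2018-01-03T09:40:35.000Z\n"
    "2018-01-03T10:17:13.000Z,2018-01-03T10:00:39.000Z\n"
)

# header=0 discards the file's own header row and uses `names` instead.
df = pd.read_csv(io.StringIO(csv_text), header=0, names=["add", "fix"])

# Convert the string columns; values that cannot be parsed become NaT.
add = pd.to_datetime(df["add"], errors="coerce", utc=True)
fix = pd.to_datetime(df["fix"], errors="coerce", utc=True)

df["delta"] = add - fix
# A plain number is often easier to aggregate than a timedelta.
df["delta_seconds"] = df["delta"].dt.total_seconds()
print(df[["delta", "delta_seconds"]])
```

Rows where either timestamp failed to parse end up with `NaT` in `delta`, so they can be found afterwards with `df['delta'].isna()`.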
sjw