
I have a pandas DataFrame with a column that contains unix timestamps, but I think they're in milliseconds because each time has 3 extra 0's at the end. For example, the first data point is 1546300800000, when it should be just 1546300800. I need to convert this column to readable times, so right now I have:

import pandas as pd

df = pd.read_csv('data.csv')
df['Time'] = pd.to_datetime(df['Time'])
df.to_csv('data.csv', index=False)

Instead of giving me the correct time, it gives me a time in 1970. For example, 1546300800000 gives me 1970-01-01 00:25:46.301100 when it should be 2019-01-01 00:00:00. It does this for every timestamp in the column, which has over 20K rows.
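The symptom above can be reproduced minimally: when `pd.to_datetime` gets an integer with no `unit` argument, it interprets the value as nanoseconds since the epoch, so a millisecond timestamp collapses to roughly 25 minutes past 1970-01-01:

import pandas as pd

# With no `unit`, pandas reads the integer as nanoseconds since the epoch,
# so 1546300800000 ns is only ~25 minutes, not the intended 2019 date.
ts = pd.to_datetime(1546300800000)
print(ts)  # 1970-01-01 00:25:46.300800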

marc_s
mbohde
  • Do not use milliseconds with timestamps. False sense of precision (now we are off 18 seconds, or "better", 18000 ms). If you need millisecond precision, do not use timestamps – Giacomo Catenazzi Oct 26 '20 at 05:45
  • @GiacomoCatenazzi: why should an integer timestamp not be able to represent millisecond precision? – FObersteiner Oct 26 '20 at 07:02
  • 1
    @MrFuppes: it's just a false sense of precision. Timestamps are not seconds since the epoch (and much less milliseconds since the epoch). In my experience it creates hidden problems (or one should implement "slew time" near leap seconds, as Google does). Much better to define your own starting point and count milliseconds (as is done scientifically, often based on GPS time). The Unix epoch is for humans, GPS (or other timescales) for real time. – Giacomo Catenazzi Oct 26 '20 at 07:17
  • @GiacomoCatenazzi: ok that might be slightly confusing to newcomers ^^ Btw. if you're talking leap smear and GPS time, shouldn't it be *accuracy* instead of precision? I mean, even if your datatype can precisely represent a certain time (or timedelta), it can be inaccurate (e.g. systematically wrong). – FObersteiner Oct 26 '20 at 07:29
  • @MrFuppes: it depends on the data (e.g. for stock exchange trades it is ok). But a time delta could be wrong if the interval includes a leap second, and during a leap second 1000 timestamps would have the same value (so an epoch timestamp will not represent exactly a certain time). The errors will come up in strange places: "not unique index" (which for performance is not checked by default by pandas), or just strange statistical results (outliers). – Giacomo Catenazzi Oct 26 '20 at 08:12

1 Answer


Data:

df = pd.DataFrame({'UNIX': ['1349720105', '1546300800']})

Conversion:

df['UNIX'] = pd.to_datetime(df['UNIX'], unit='s')
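For the question's actual data, the values are milliseconds rather than seconds, so the same fix would use unit='ms' (a minimal sketch, using the 'Time' column name and the sample value from the question):

import pandas as pd

# The question's values are milliseconds since the epoch, so pass unit='ms'.
df = pd.DataFrame({'Time': [1546300800000]})
df['Time'] = pd.to_datetime(df['Time'], unit='ms')
print(df['Time'].iloc[0])  # 2019-01-01 00:00:00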
wwnde