0

enter image description here

UPDATE. I GOT THE GRAPH WITH THE HELP OF @MrFuppes FROM THIS COMMUNITY. BUT IS THERE ANY WAY WITH WHICH I CAN DISPLAY MY DATES IN AN IDENTIFIABLE FORM TO THE USER; INSTEAD OF 1.565B, 1.57B? :)

I tried to convert to float value in :- P.S all_data_1 is my dataframe. [Dates inside my Excel files are in the format 04.03.2020].

all_data_1['Datum'] = pd.to_datetime(all_data_1['Datum'],dayfirst=True, format='%d.%m.%Y')

but then, when I am trying to plot "lowess" trendline

fig = px.scatter(all_data_1, x="Datum", y="Et", color="Proben-Nr.", trendline="lowess") 
fig.show()

error message is coming as

TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]

Is there any method to convert Date in to Float inside a Dataframe

or

What is the solution of this error in this case i.e while using "pandas.to_datetime."

code

import glob
import shutil
import pandas as pd
import plotly.express as px
import xlrd
import matplotlib.pyplot as plt
%matplotlib inline
import time
from datetime import datetime
from pytz import timezone
import numpy as np
from numpy import diff
from sklearn import metrics
glob.glob("C:/Users/Dreamer/Desktop/*.xls")

# grab excel files only

pattern = "C:/Users/Dreamer/Desktop/*.xls"

# Save all file matches: xlsx_files

xls_files = glob.glob(pattern,recursive=True)
frames1 = []

for file in xls_files:
    df = pd.read_excel(file,'Ergebnisse')

    df1=df.loc[(df['Probenbezeichnung']=='A')]
    frames1.append(df1)

# Concatenate frames into dataframe

all_data_1 = pd.concat(frames1)

#all_data_1['Datum']=pd.to_datetime(all_data_1['Datum'],dayfirst=True,format='%d.#%m.%Y')  COMMENTED BECAUSE ITS NOT WORKING
all_data_1.sort_values(by=['Datum'],inplace=True, ascending=True)
all_data_1

# Find cells with no values

nan_df = all_data_1[all_data_1.isna().any(axis=1)]
display(nan_df.head())

all_data_1 = all_data_1.dropna(how='all')
all_data_1.head()
all_data_1 = all_data_1.dropna(how='any')
all_data_1.head()

#Saving to excel file

all_data_1.tail()
all_data_1.to_excel("C:/Users/Dreamer/Desktop/Data.xls");

#PLOT

fig = px.scatter(all_data_1, x="Datum", y="Et", color="Proben-Nr.", trendline="lowess")
fig.show()
Dreamer
  • 25
  • 2
  • 8
  • 1
    you should provide reproducible code . – snehil Jun 03 '20 at 23:49
  • the issue seems to be that `plotly.express.scatter` has an issue with the datetime column of your dataframe. maybe it helps to convert the column to int first, as shown [here](https://stackoverflow.com/a/40881958/10197418) – FObersteiner Jun 04 '20 at 07:18
  • i updated my code can you check? I just want to convert date to float within a data frame :) @MrFuppes – Dreamer Jun 04 '20 at 08:19

1 Answers1

0

plotly should accept datetime as string formatted '%Y-%m-%d', so this could work:

all_data_1['Datum'] = pd.to_datetime(all_data_1['Datum'], format='%d.%m.%Y').dt.strftime('%Y-%m-%d')
fig = px.scatter(all_data_1, x="Datum", y="Et", color="Proben-Nr.", trendline="lowess")
fig.show()

You could also try Unix time (seconds since the epoch), e.g. like

# this line should stay; convert from string to datetime,
# but you don't need dayfirst=True if you explicitly set a format
all_data_1['Datum'] = pd.to_datetime(all_data_1['Datum'], format='%d.#%m.%Y')

# now add a column with time in seconds as type float (or int; use np.int):
all_data_1['POSIX'] = all_data_1['Datum'].values.astype(np.float) / 10**9

# now call plot as
fig = px.scatter(all_data_1, x="POSIX", y="Et", color="Proben-Nr.", trendline="lowess")
fig.show()
FObersteiner
  • 22,500
  • 8
  • 42
  • 72
  • Hello, Its working :). But the dates are coverted to a wierd number. Its in Billions. So, Is there any way by which a user can identify the dates from the graph? I had included the graph I got. – Dreamer Jun 04 '20 at 18:00
  • @Dreamer: ok seems like I misinterpreted what `plotly` expects. Added another option you can try. – FObersteiner Jun 04 '20 at 18:32
  • I tried it, This error is coming then ! ValueError: could not convert string to float: '02.08.2019' – Dreamer Jun 04 '20 at 19:03
  • @Dreamer: that strange, `.dt.strftime('%Y-%m-%d')` should convert that to '2019-08-02', which px.scatter should accept – FObersteiner Jun 04 '20 at 19:18
  • @ MrFuppes But by this we are converting Date to String right? – Dreamer Jun 04 '20 at 19:31
  • @Dreamer: sort-of; the line `pd.to_datetime(all_data_1['Datum'], format='%d.%m.%Y').dt.strftime('%Y-%m-%d')` converts string to datetime and back to string - because plotly's plots should be able to parse string correctly while POSIX is... well, seconds since 1970-01-01 and therefore not really human-readable. – FObersteiner Jun 04 '20 at 19:35
  • I tried again, it is not seem to be working. May be it will be an issue with this particular trendline"lowess" [Lowesstrendline in plotly](https://plotly.com/python/linear-fits/#fitting-multiple-lines-and-retrieving-the-model-parameters) . But it should be compatible with Plotly... hmmm! And also even when I am using POSIX, I am only getting trendline for one eventhough there are 6 probe numbers? – Dreamer Jun 04 '20 at 19:52