I want to plot time-series data using Matplotlib. The data is stored in CSV format, which I read into a pandas DataFrame using pd.read_csv(); that part works fine. A data set comprises one timestamp column and around 10 value columns. I convert the timestamps (initially strings of the format yyyy-MM-dd hh:mm:ss) to datetime via pd.to_datetime(dataFrame['TIMESTAMP'], format='%Y-%m-%d %H:%M:%S').
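To make the conversion step concrete, here is a minimal sketch of the readout. The two-row CSV text and the `elapsed` variable are made up for illustration; only the `pd.read_csv()` and `pd.to_datetime()` calls mirror what I actually do.

```python
import io

import pandas as pd

# Hypothetical two-row CSV standing in for one of the real files.
csv_text = (
    "TIMESTAMP,Y1\n"
    "2020-01-16 08:00:00,101.5\n"
    "2020-01-16 08:01:12,98.2\n"
)

dataFrame = pd.read_csv(io.StringIO(csv_text))

# Parse the string column into datetime64 values.
dataFrame['TIMESTAMP'] = pd.to_datetime(dataFrame['TIMESTAMP'], format='%Y-%m-%d %H:%M:%S')

# Once the column is datetime, subtracting the first entry gives the elapsed time.
elapsed = dataFrame['TIMESTAMP'] - dataFrame['TIMESTAMP'].iloc[0]
```

The subtraction in the last line yields a Timedelta per row, which is the quantity I would eventually like to see on the x-axis.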
To plot the data I use the following code (generation of sample data is not part of my code):
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
N = 30
timestamps = pd.date_range('2020-01-16 8:00', periods=N, freq='72s')
# note: the original timestamps aren't evenly spaced, this is just data to test
dataFrame = pd.DataFrame({'TIMESTAMP': timestamps, 'Y1': np.random.normal(100, 30, N), 'Y2': np.random.normal(100, 30, N)})
acqFieldName = 'Y1'
fig = sns.pointplot(x='TIMESTAMP', y=acqFieldName, data=dataFrame, scale=0.75)
timestamps = dataFrame['TIMESTAMP'].dt.time
fig.axes.set_xticklabels(labels=timestamps, rotation=45)
plt.show()
Which results in the following:
Still, I would like to change the x-axis: the ticks are too dense, so I'd like to have, say, 10 ticks, and I'd like them to show the elapsed time in minutes and seconds, formatted as 'mm:ss'.
I tried the following:
fig = sns.pointplot(x='TIMESTAMP', y=acqFieldName, data=dataFrame, scale=0.75)
timestamps = dataFrame['TIMESTAMP'].dt.time
xmin = dataFrame['TIMESTAMP'][0]
xmax = dataFrame['TIMESTAMP'][len(dataFrame['TIMESTAMP']) - 1]
timeDiff: pd.Timedelta = xmax - xmin
customTicks = np.linspace(0., timeDiff.seconds, 10)
fig.axes.set_xticklabels(labels=customTicks, rotation=45)
fig.axes.set_xticks(customTicks)
plt.show()
Which results in the following:
This is obviously not what I want.
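A likely reason the attempt above goes wrong, as far as I understand it: sns.pointplot treats x as categorical and places the i-th sample at position i (0 to N-1), so tick positions given in seconds fall far outside the plotted range. The sketch below computes about 10 tick positions in index space together with matching 'mm:ss' labels. The names `tick_positions` and `tick_labels` are only illustrative, and the categorical-position behaviour is an assumption about pointplot's default mode.

```python
import numpy as np
import pandas as pd

N = 30
timestamps = pd.Series(pd.date_range('2020-01-16 8:00', periods=N, freq='72s'))

# Assumption: the categorical axis places sample i at x position i (0 .. N-1),
# so the ticks are chosen as sample indices, not as seconds.
tick_positions = np.linspace(0, N - 1, 10).round().astype(int)

# Elapsed time since the first sample, formatted as mm:ss for each chosen index.
elapsed = timestamps - timestamps.iloc[0]
tick_labels = []
for pos in tick_positions:
    minutes, seconds = divmod(int(elapsed.iloc[pos].total_seconds()), 60)
    tick_labels.append(f"{minutes:02d}:{seconds:02d}")
```

These could then be passed to `set_xticks(tick_positions)` followed by `set_xticklabels(tick_labels, rotation=45)` on the axes returned by pointplot.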
My problem would be solved if I could reduce the number of ticks formatted as time or, better, if the points aligned with ticks showing the elapsed time.
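To illustrate the second option, here is a rough sketch that plots against elapsed seconds on a numeric axis with plain Matplotlib instead of seaborn, so points and ticks share one scale. The helper `format_mmss` and the variable `elapsed_seconds` are hypothetical names, and the sample data is the same kind of test data as above, not my real measurements.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import FuncFormatter, MaxNLocator

N = 30
rng = np.random.default_rng(0)
dataFrame = pd.DataFrame({
    'TIMESTAMP': pd.date_range('2020-01-16 8:00', periods=N, freq='72s'),
    'Y1': rng.normal(100, 30, N),
})

# Seconds since the first sample; this also works for unevenly spaced timestamps.
elapsed_seconds = (dataFrame['TIMESTAMP'] - dataFrame['TIMESTAMP'].iloc[0]).dt.total_seconds()


def format_mmss(x, pos=None):
    """Format a tick value given in seconds as mm:ss (hypothetical helper)."""
    if x < 0:
        return ''  # ticks left of the first sample get no label
    minutes, seconds = divmod(int(round(x)), 60)
    return f"{minutes:02d}:{seconds:02d}"


fig, ax = plt.subplots()
ax.plot(elapsed_seconds, dataFrame['Y1'], marker='o')
ax.xaxis.set_major_locator(MaxNLocator(nbins=10))         # at most about 10 intervals
ax.xaxis.set_major_formatter(FuncFormatter(format_mmss))  # seconds -> mm:ss
ax.tick_params(axis='x', labelrotation=45)
ax.set_xlabel('Elapsed time (mm:ss)')
fig.tight_layout()
```

Calling `plt.show()` or `fig.savefig(...)` afterwards displays or stores the figure. The trade-off is that this gives up pointplot's styling in exchange for a true time axis.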
Update: Zaraki Kenpachi's suggestion yields:
fig, ax = plt.subplots()
ax.plot(dataFrame.set_index('TIMESTAMP'), dataFrame[acqFieldName])
plt.show()
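For comparison, a variant of this approach that passes the timestamp column itself as the x values and lets Matplotlib's date locator and formatter pick the ticks could look like the sketch below. It shows wall-clock time rather than elapsed time, and the choice of `maxticks=10` and the '%H:%M' format are my own assumptions, not part of the suggestion.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

N = 30
rng = np.random.default_rng(0)
dataFrame = pd.DataFrame({
    'TIMESTAMP': pd.date_range('2020-01-16 8:00', periods=N, freq='72s'),
    'Y1': rng.normal(100, 30, N),
})

fig, ax = plt.subplots()
# x is the datetime column, y the value column.
ax.plot(dataFrame['TIMESTAMP'], dataFrame['Y1'], marker='o')
ax.xaxis.set_major_locator(mdates.AutoDateLocator(maxticks=10))  # cap the tick count
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))      # wall-clock labels
fig.autofmt_xdate(rotation=45)
fig.tight_layout()
```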
Working solution based on JohanC's answer:
import datetime
import glob

for fileName in glob.glob('*.csv'):
    plt.close()
    # NOTE: CsvFileProcessor is a custom class doing the readout of CSV and conversion to pandas.DataFrame
    dataFrame, acqFieldName, settingParameterCount = CsvFileProcessor.processFile(fileName)
    fig, ax = plt.subplots()
    ax: plt.Axes = sns.pointplot(x='TIMESTAMP', y=acqFieldName, data=dataFrame, scale=0.75, ax=ax)
    # time elapsed since the first sample, one entry per data point
    startTime = dataFrame['TIMESTAMP'][0]
    timeProgress = []
    for timeStamp in dataFrame['TIMESTAMP']:
        timePassed = timeStamp - startTime
        timeProgress.append(timePassed)
    # pointplot puts the points at positions 0..N-1, so label every 5th position
    custom_ticks = range(0, len(timeProgress), 5)
    timestamps = [f"{datetime.timedelta(seconds=timeProgress[t].seconds)}" for t in custom_ticks]
    # for manipulating the x-axis tick labels:
    # https://stackoverflow.com/questions/51105648/ordering-and-formatting-dates-on-x-axis-in-seaborn-bar-plot
    # set the tick positions first, then the matching labels
    ax.axes.set_xticks(custom_ticks)
    ax.axes.set_xticklabels(labels=timestamps, rotation=45)
    ax.axes.set_xlabel(xlabel="Processing Time")
    plt.title('Setting Parameters: ' + str(settingParameterCount))
    # derive the output name, e.g. FOO_DATA.CSV -> FOO_READOUT.PNG
    outFileName = fileName.upper()
    outFileName = outFileName.replace('_DATA.CSV', '')
    outFileName = outFileName + '_READOUT.PNG'
    fig.tight_layout()
    #plt.savefig(outFileName)
    plt.show()
results in: