I have data that looks like this:
Observed WRF
2014-06-28 12:00:00 0.000000 1.823554
2014-06-28 13:00:00 0.000000 1.001567
2014-06-28 14:00:00 0.000000 0.309840
2014-06-28 15:00:00 0.000000 0.889811
2014-06-28 16:00:00 0.000000 0.939780
2014-06-28 17:00:00 1.251794 1.271781
2014-06-28 18:00:00 1.610596 0.935092
2014-06-28 19:00:00 2.129068 0.868775
2014-06-28 20:00:00 2.326501 0.892550
...
2014-08-31 05:00:00 0.365868 2.463277
2014-08-31 06:00:00 0.281729 1.233760
2014-08-31 07:00:00 0.197590 0.427411
2014-08-31 08:00:00 0.127754 0.299558
2014-08-31 09:00:00 0.000000 0.571106
2014-08-31 10:00:00 0.000000 0.449634
2014-08-31 11:00:00 0.000000 0.324269
2014-08-31 12:00:00 0.000000 1.725650
I wish to produce a graph with two sets of differently colored boxplots on it. I'm not very experienced with boxplots to begin with, so my technique may be what is failing me. I have produced the following code:
df7.boxplot(by='day',whis=[10,90],sym=' ',figsize=(16,8),color=((1,0.502,0),'black'))\
.legend(loc='lower center', bbox_to_anchor=(1.007, -0.06),prop={'size':16})
plt.subplots_adjust(left=.1, right=0.9, top=0.9, bottom=.2)
plt.title('Five Day WRF Model Comparison Near %.2f,%.2f' %(lat,lon),fontsize=24)
plt.ylabel('Hourly Wind Speed [$W/m^2$]',fontsize=18,color='black')
ax7=plt.gca()
ax7.xaxis.set_label_coords(0.5, -0.05)
plt.xlabel('Time',fontsize=18,color='black')
plt.show()
This then gives me:
File "<ipython-input-35-9945f2efb84e>", line 1, in <module>
df7.boxplot(by='D',whis=[10,90],sym=' ',figsize=(16,8),color=((1,0.502,0),'black')).legend(loc='lower center', bbox_to_anchor=(1.007, -0.06),prop={'size':16})
File "...\Anaconda2\lib\site-packages\pandas\core\frame.py", line 5581, in boxplot
return_type=return_type, **kwds)
File "...\Anaconda2\lib\site-packages\pandas\tools\plotting.py", line 2747, in boxplot
return_type=return_type)
File "...\Anaconda2\lib\site-packages\pandas\tools\plotting.py", line 3139, in _grouped_plot_by_column
grouped = data.groupby(by)
File "...\Anaconda2\lib\site-packages\pandas\core\generic.py", line 3778, in groupby
**kwargs)
File "...\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 1427, in groupby
return klass(obj, by, **kwds)
File "...\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 354, in __init__
mutated=self.mutated)
File "...\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 2383, in _get_grouper
in_axis, name, gpr = True, gpr, obj[gpr]
File "...\Anaconda2\lib\site-packages\pandas\core\frame.py", line 1997, in __getitem__
return self._getitem_column(key)
File "...\Anaconda2\lib\site-packages\pandas\core\frame.py", line 2004, in _getitem_column
return self._get_item_cache(key)
File "...\Anaconda2\lib\site-packages\pandas\core\generic.py", line 1350, in _get_item_cache
values = self._data.get(item)
File "...\Anaconda2\lib\site-packages\pandas\core\internals.py", line 3290, in get
loc = self.items.get_loc(item)
File "...\Anaconda2\lib\site-packages\pandas\indexes\base.py", line 1947, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)
File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)
File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)
File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)
KeyError: 'D'
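For reference, the KeyError above can be reproduced without any plotting. The sketch below uses made-up data: only the column names and the hourly DatetimeIndex mirror the frame above, while the names df, key_error_raised, counts and the derived column day are hypothetical. It assumes that a string passed to by= ends up as a column (or index level) lookup, which is what the data.groupby(by) frame in the traceback suggests.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for df7: hourly DatetimeIndex, two float columns.
idx = pd.date_range("2014-06-28 12:00", periods=72, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"Observed": rng.random(len(idx)), "WRF": rng.random(len(idx))},
    index=idx,
)

# A string key is resolved against column (or index level) names,
# and this frame has no column called "D".
try:
    df.groupby("D")
    key_error_raised = False
except KeyError:
    key_error_raised = True

# Hypothetical workaround: derive an explicit grouping column from the index.
df["day"] = df.index.day
counts = df.groupby("day")["WRF"].count()

print(key_error_raised)
print(counts)
```

If that assumption holds, passing by='day' to boxplot would only work after such a column exists in the frame.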
I would like the boxplots to be grouped by day or by month and plotted on the same graph in two different colors (one orange, the other black), overlaid so that the differences between the two series are easy to see. If that is not possible without it looking like a mess, then I would plot them as two subplots on one figure (I can do that part). However, the grouping step is what fails: I can't figure out why my datetime index cannot be used to group by day or by 7-day period. I have also tried
df7.boxplot(by=df07.index.day,whis=[10,90],sym=' ',figsize=(16,8),color=((1,0.502,0),'black'))\
.legend(loc='lower center', bbox_to_anchor=(1.007, -0.06),prop={'size':16})
...
which then gives me:
AssertionError: Grouper and axis must be same length
I am not sure what is going on, but it does not seem to recognize the DatetimeIndex, even though df7.info() returns:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1542 entries, 2014-06-28 12:00:00 to 2014-08-31 12:00:00
Data columns (total 2 columns):
Observed 1542 non-null float64
WRF 1542 non-null float64
dtypes: float64(2)
memory usage: 36.1 KB
So the index does appear to be a DatetimeIndex.
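To make the target figure concrete, here is a minimal sketch of the side-by-side layout described above, drawn with matplotlib's Axes.boxplot directly instead of DataFrame.boxplot. The data is made up, and the names df, grouped, labels, pos and draw are hypothetical; the 10th/90th percentile whiskers, hidden fliers, figure size and the orange/black colors are taken from my code above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for df7: ten full days of hourly values.
idx = pd.date_range("2014-06-28", periods=240, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"Observed": rng.gamma(2.0, 1.0, len(idx)), "WRF": rng.gamma(2.0, 1.0, len(idx))},
    index=idx,
)

# Group by calendar date; the key array has the same length as the frame.
grouped = df.groupby(df.index.normalize())
labels = [day.strftime("%m-%d") for day, _ in grouped]
obs = [g["Observed"].to_numpy() for _, g in grouped]
wrf = [g["WRF"].to_numpy() for _, g in grouped]

fig, ax = plt.subplots(figsize=(16, 8))
pos = np.arange(len(labels))

def draw(data, offset, color):
    """Draw one series of boxes shifted by `offset` and colored `color`."""
    bp = ax.boxplot(data, positions=pos + offset, widths=0.35,
                    whis=(10, 90), showfliers=False, manage_ticks=False)
    for part in ("boxes", "whiskers", "caps", "medians"):
        plt.setp(bp[part], color=color)
    return bp

bp_obs = draw(obs, -0.2, (1, 0.502, 0))  # orange
bp_wrf = draw(wrf, +0.2, "black")

ax.set_xticks(pos)
ax.set_xticklabels(labels)
ax.legend([bp_obs["boxes"][0], bp_wrf["boxes"][0]], ["Observed", "WRF"])
# Call plt.show() or fig.savefig(...) here to view the result.
```

The sketch groups on index.normalize() and not on index.day, because the day of month alone would merge, for example, June 28 with July 28 in data that spans several months.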
Any and all help is appreciated, and if further clarification is needed, I am more than happy to provide extra information.