I have the following data frame d:

             TS Turbidity
1 2014-12-12 00:00:00        87
2 2014-12-12 00:15:00        87
3 2014-12-12 00:30:00        91
4 2014-12-12 00:45:00        84
5 2014-12-12 01:00:00        92
6 2014-12-12 01:15:00        89

TS is my timestamp, combining year, month, day, hour, minute, and second. When I inspect the structure of my data, TS is:

$ TS       : POSIXct, format: "2014-12-12 00:00:00" "2014-12-12 00:15:00"

So as far as I can tell, R understands that TS is a date-time.

I want to create one boxplot per month (note that I have several years of data). I create a new column Month as follows:

d$Month<-as.Date(cut(d$TS, breaks="month"))

Then I plot the data with:

ggplot(d, aes(x = factor(Month), y = Turbidity))+ geom_boxplot() + theme_bw()

This plots my data correctly, but I get too many x-axis labels and would like a label only every 4 months, for example. So I add scale_x_date:

ggplot(d, aes(x = factor(Month), y = Turbidity))+ geom_boxplot() + theme_bw() + 
scale_x_date(date_breaks = "4 month", date_labels = "%B")

This is the step where I run into trouble. I get this error message:

Error: Invalid input: date_trans works with objects of class Date only

But R reports that Month is a Date:

$ Month    : Date, format: "2014-12-01" "2014-12-01" "2014-12-01"

I have searched the forums but cannot figure out where the problem is, because as far as I can see Month is already a Date.

Thanks for any help!

Elia

2 Answers

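One approach (shown here with modified data) is to cut the timestamps by quarter:

library(ggplot2)
library(lubridate)
library(dplyr)   # provides %>% and mutate()

# Parse the text timestamps into POSIXct, then draw one box per quarter
df %>% mutate(TS = ymd_hms(TS)) %>%
  ggplot(aes(x = cut(TS, breaks="quarter"), y = Turbidity)) +    
  geom_boxplot() + 
  labs(x = "Start date of Quarter") +
  theme_bw()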

Data (different from the OP's):

df <- read.table(text = 
"TS                           Turbidity
'2014-09-12 00:00:00'        87
'2014-09-12 00:15:00'        107
'2014-10-12 00:30:00'        91
'2014-10-12 00:30:00'        50
'2014-11-12 00:45:00'        84
'2014-11-12 00:45:00'        60
'2014-12-12 01:00:00'        92
'2014-12-12 01:15:00'        60
'2015-01-12 00:00:00'        87
'2015-01-12 00:15:00'        107
'2015-02-12 00:30:00'        91
'2015-02-12 00:30:00'        50
'2015-03-12 00:45:00'        84
'2015-03-12 00:45:00'        60
'2015-04-12 01:00:00'        92
'2015-04-12 01:15:00'        60
'2015-05-12 00:00:00'        87
'2015-05-12 00:15:00'        107
'2015-06-12 00:30:00'        91
'2015-06-12 00:30:00'        50
'2015-07-12 00:45:00'        84
'2015-07-12 00:45:00'        60
'2015-08-12 01:00:00'        92
'2015-08-12 01:15:00'        60", header = TRUE, stringsAsFactors = FALSE)
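
If one box per month is still wanted, with only every fourth label shown, a sketch along these lines may work. It keeps the x variable as a factor and thins the labels with scale_x_discrete(breaks = ...). The toy data, the Month column, and the names lv, keep and p are assumptions for illustration, not part of the original answer, and the month abbreviations produced by "%b" depend on the locale.

```r
library(ggplot2)

# Hypothetical toy data: 5 readings in each of 12 consecutive months
set.seed(1)
df <- data.frame(
  TS = rep(seq(as.POSIXct("2014-09-12", tz = "UTC"),
               by = "month", length.out = 12), each = 5)
)
df$Turbidity <- round(runif(nrow(df), min = 50, max = 110))

# cut() returns a factor with one level per month ("2014-09-01", ...)
df$Month <- cut(df$TS, breaks = "month")

# Keep every 4th level as an axis break; the other boxes stay but lose their label
lv   <- levels(df$Month)
keep <- lv[seq(1, length(lv), by = 4)]

p <- ggplot(df, aes(x = Month, y = Turbidity)) +
  geom_boxplot() +
  scale_x_discrete(breaks = keep,
                   labels = format(as.Date(keep), "%b-%Y")) +
  theme_bw()
p
```

Because the axis stays discrete, the boxes are evenly spaced even if some months have no data.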
MKR
  • Thanks for your answer. I previously obtained the same type of graph, but it has too many x-labels (Dec-2014, Nov-2014, etc.). How can I make labels appear only every 4 months (Dec-2014, March-2015, etc.)? – Elia Apr 02 '18 at 17:05
  • Yes, even that can be done. I wanted to show you a simple working example. The best way should be to `cut` and divide. What is the full range of your dates? I might not get back to you for a couple of hours, though. – MKR Apr 02 '18 at 17:09
  • OK, thanks! My full date range is from 2014-12-12 00:00:00 to 2018-03-19 09:15:00. – Elia Apr 02 '18 at 17:14
  • @Elia Have a look at the updated answer. I have used modified `data`. The box plot is drawn for every quarter. Hope it helps. – MKR Apr 02 '18 at 21:59

In your call to ggplot you explicitly convert Month to a factor with aes(x = factor(Month)), so scale_x_date receives a factor rather than a Date. Try removing the factor() wrapper from Month.

The conversion doesn't change the object outside of ggplot, which is why you still see that its class is Date when you check it. But inside the ggplot call you are definitely converting the class from Date to factor.
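
A minimal sketch of this, assuming simulated data in place of the asker's real readings: Month stays a Date, and group = Month is added so that geom_boxplot still draws one box per month on the continuous date axis (without a grouping variable, a continuous x typically collapses into a single box). The simulated d and the name p are assumptions for illustration only.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: 15-minute readings over about 16 months
set.seed(1)
d <- data.frame(
  TS = seq(as.POSIXct("2014-12-12 00:00:00", tz = "UTC"),
           as.POSIXct("2016-03-31 23:45:00", tz = "UTC"),
           by = "15 min")
)
d$Turbidity <- round(rnorm(nrow(d), mean = 88, sd = 5))

# Same Month column as in the question; it remains of class Date
d$Month <- as.Date(cut(d$TS, breaks = "month"))

# No factor() around Month, so scale_x_date() gets a Date;
# group = Month keeps one box per month
p <- ggplot(d, aes(x = Month, y = Turbidity, group = Month)) +
  geom_boxplot() +
  scale_x_date(date_breaks = "4 months", date_labels = "%b-%Y") +
  theme_bw()
p
```

Here date_breaks controls only where labels and grid lines fall; the number of boxes is determined by the grouping.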

apax
  • I tried that, but then my figure showed a single boxplot for the entire period. – Elia Apr 02 '18 at 16:08