5

I have a data frame of dates (class Date; see the bottom of the question). I want to convert them to day of week and then draw a histogram, ideally with the labels 'Monday'...'Sunday' rather than numbers.

I have two distinct problems:

  1. It's easy to convert a Date object to day-of-week, but the result is string or numeric, not an object.
  2. When I get a histogram, the bins and labels are wrong (see below).

If I use weekdays(dat), the output is a character vector ("Monday"...), which cannot be passed to hist().

Alternatively, if I convert to numeric data, how do I get string labels on hist()?

> dotw <- with( month.day.year(dat[,1]), day.of.week(month,day,year) )
> hist(xxx,labels=c('M','Tu','W','Th','F','Sa','Su'),col='black') # WTF?!
> hist(dotw,xlab=list('M','Tu','W','Th','F','Sa','Su'))

Neither call labels the bars as intended. Why are the bins 0.5 wide? And why is there no gap between Sunday (0) and Monday (1) when the other days are separated? Ideally there would be no gaps between any of the columns.

My data looks like:

> dat
  [1] "2010-04-02" "2010-04-06" "2010-04-09" "2010-04-10" "2010-04-14" "2010-04-15" "2010-04-19"
  [8] "2010-04-21" "2010-04-22" "2010-04-23" "2010-04-26" "2010-04-28" "2010-04-29" "2010-04-30"
 ...

> str(dat)
 Date[1:146], format: "2010-04-02" "2010-04-06" "2010-04-09" "2010-04-10" "2010-04-14" "2010-04-15" ...

> str(weekdays(dat))
 chr [1:146] "Friday" "Tuesday" "Friday" "Saturday" "Wednesday" "Thursday" "Monday" ...
> hist(weekdays(dat))
Error in hist.default(weekdays(dat)) : 'x' must be numeric
smci

3 Answers

7
dat <- as.Date(c("2010-04-02", "2010-04-06", "2010-04-09", "2010-04-10", "2010-04-14",
                 "2010-04-15", "2010-04-19", "2010-04-21", "2010-04-22", "2010-04-23",
                 "2010-04-24", "2010-04-25", "2010-04-26", "2010-04-28", "2010-04-29",
                 "2010-04-30"))
dwka <- format(dat, "%a")   # abbreviated weekday names
dwka
#  [1] "Fri" "Tue" "Fri" "Sat" "Wed" "Thu" "Mon"
#  [8] "Wed" "Thu" "Fri" "Sat" "Sun" "Mon" "Wed"
# [15] "Thu" "Fri"
dwkn <- as.numeric(format(dat, "%w"))   # numeric version, 0 = Sunday
# breaks centred on the integers 0..6; labels sorted into weekday order
hist(dwkn, breaks = -0.5 + 0:7, labels = unique(dwka[order(dwkn)]))
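
The `unique(dwka[order(dwkn)])` labels line up with the bars only when every weekday occurs at least once in the data. Below is a minimal sketch of a variant that does not depend on that. The fixed `day_labels` vector is an assumption for illustration (it is not part of the answer above) and is ordered to match `%w`, where 0 is Sunday.

```r
# Hypothetical sample that contains no Saturdays or Sundays
dat <- as.Date(c("2010-04-02", "2010-04-06", "2010-04-09", "2010-04-14",
                 "2010-04-15", "2010-04-19", "2010-04-21"))

# %w gives the weekday as a number, 0 = Sunday ... 6 = Saturday
dwkn <- as.numeric(format(dat, "%w"))

# Fixed labels in %w order (assumption: English abbreviations are wanted)
day_labels <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")

# Breaks at -0.5, 0.5, ..., 6.5 give one unit-wide bin centred on each integer
h <- hist(dwkn, breaks = -0.5 + 0:7, labels = day_labels,
          xlab = "Day of week", main = "")

# Each bin count should match a direct tally of the integer codes
stopifnot(all(h$counts == tabulate(dwkn + 1, nbins = 7)))
```

Because the labels are fixed rather than derived from the data, a weekday with a zero count still gets its own (empty) bin and the remaining labels stay aligned.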


IRTFM
  • Beautiful, thank you! So good it should be a builtin! (I would not have expected to use breaks at 0.5 for integer data, that really should be builtin and prevent the bogus half-width bars.) – smci Aug 03 '11 at 19:42
4

I suspect you want a barplot rather than a histogram. You can use table to count the days.

barplot(table(weekdays(dat)))

Note that by default the days will be sorted alphabetically, so to order it more naturally you will have to reorder the levels in a factor call:

barplot(table(factor(weekdays(dat),levels=c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"))))
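
Since the question asks for Monday through Sunday with no gaps between columns, here is a sketch of the same idea with a Monday-first ordering. It assumes an English locale, because `weekdays()` returns names in the session's locale. The `day_levels` and `dow` names are only for illustration.

```r
dat <- as.Date(c("2010-04-02", "2010-04-06", "2010-04-09", "2010-04-10",
                 "2010-04-14", "2010-04-15", "2010-04-19"))

# Monday-first level order (assumes weekdays() returns English names)
day_levels <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")

# A factor with explicit levels keeps days with zero counts in the table
dow <- factor(weekdays(dat), levels = day_levels)
counts <- table(dow)

# space = 0 removes the gaps between bars; las = 2 rotates the day names
barplot(counts, space = 0, las = 2)
```

In a non-English locale the names would not match `day_levels` and the factor would contain `NA`s, so the levels would need to be written in that locale's spelling.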
James
3

Convert your weekdays(dat) to a factor (the data type for categorical variables), and unclass it (which converts it to integer codes) for the histogram. There are operations on the factor class which make it easy to create the custom x-axis.

## days of the week
days <- c('Sun','Mon','Tues','Wed','Thurs','Fri','Sat')

## sample with replacement to generate data for this example
samples <- sample(days,100,replace=TRUE)

## convert to factor
## specify levels to specify the order
samples <- factor(samples,levels=days)

hist(unclass(samples),xaxt="n")
axis(1,at=1:nlevels(samples),lab=levels(samples))
box()
hatmatrix
  • Ok thanks. In my approach why do I get the 0.5-width bins, the lack of gap between 0('Sunday')&1('Monday'), and the mismatched-width-0.5 labeling from `hist(labels=c('M','Tu','W','Th','F','Sa','Su'))`? – smci Aug 03 '11 at 08:03
  • There is a `breaks` argument to `hist` to control the bin widths; for more complete control over the axis appearance I would set `xaxt="n"` in `hist` and draw my own with `axis`. – hatmatrix Aug 03 '11 at 08:18
  • Note that hist is a generic function and does different things depending on the class of the first argument that you provide. So it depends on what `xxx` and `dotw` in your example are. – hatmatrix Aug 03 '11 at 08:28
  • What I wrote in the question: _dat_ is a data-frame of Date objects. _dotw_ is an integer (0..6) computed by the code shown. – smci Aug 03 '11 at 08:37
  • This was also very helpful, thanks. Sorry could only pick one. – smci Nov 01 '11 at 02:31
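
The bin questions raised in the comments above can be explored without drawing anything, since `hist(..., plot = FALSE)` returns the breaks and counts it would use. This is a rough sketch on made-up integer codes (the `dotw` values are hypothetical, not the question's data). The breaks `hist()` picks by default depend on the data, so they are printed rather than assumed.

```r
# Hypothetical integer weekday codes, 0 = Sunday ... 6 = Saturday
dotw <- c(5, 2, 5, 6, 3, 4, 1, 3, 4, 5, 6, 0, 1, 3, 4, 5)

# plot = FALSE returns the bin structure instead of drawing it
h_default <- hist(dotw, plot = FALSE)
print(h_default$breaks)   # the breaks hist() chose on its own
print(h_default$counts)

# By default bins are right-closed, (a, b], and the lowest value is included
# in the first bin, so 0 falls in the first bin and 1 in the bin ending at 1
h_fixed <- hist(dotw, breaks = -0.5 + 0:7, plot = FALSE)
print(h_fixed$counts)     # one bin per integer code
```

When the chosen breaks fall on the integers themselves, each value sits on a bin edge rather than in the middle of a bin, which is consistent with the adjacent Sunday and Monday bars described in the question. Centring the breaks on the half-integers avoids this.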