I have a data frame that contains Date, Day-of-the-Week (categorical), and number of calls (numeric). I'm trying to analyze the distribution of call volume by day of the week. Using the lattice package I was able to create a bar chart, but I still need some additional help:

  1. How can I order/sort the Day-of-the-Week variables that appear in my barchart (I'd like it to start on Sat, then Sun, then Mon,...)?

  2. I'd also like to transpose the distribution so that the bar charts are vertical as opposed to horizontal.

  3. Finally, how can I add a box-and-whisker plot? Should I still use tapply for this?

Thanks!

Here's what I've done so far:

LatinoDRTVdata <- read.csv("//dishfs1/Marketing/Mktg_Analytics/Team Member folders/Ryan_Chase/Ad Hoc/Latino DRTV Normalized Calls.csv")

#look at the first 6 rows (the default for head)
head(LatinoDRTVdata)

#look at the full dataset
LatinoDRTVdata

#look at the column names
colnames(LatinoDRTVdata)

#check the class of the Normalized.Latino.DRTV.call.volume column
class(LatinoDRTVdata$Normalized.Latino.DRTV.call.volume)

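##make the call volume a numeric vector
##(going through as.character avoids getting level codes if the column was read as a factor)
LatinoDRTVdata$Normalized.Latino.DRTV.call.volume <- as.numeric(as.character(LatinoDRTVdata$Normalized.Latino.DRTV.call.volume))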

#now check the class again
class(LatinoDRTVdata$Normalized.Latino.DRTV.call.volume)
(LatinoDRTVdata$Normalized.Latino.DRTV.call.volume)

#194 calls is the mean volume regardless of the day
mean(LatinoDRTVdata$Normalized.Latino.DRTV.call.volume)

#Day of the week is a factor
class(LatinoDRTVdata$Day.of.the.Week)

summary(LatinoDRTVdata)

str(LatinoDRTVdata)

#histogram of daily Latino DRTV call volume
hist(LatinoDRTVdata$Normalized.Latino.DRTV.call.volume)

#find the mean of each day
Daily.Latino.DRTV.Distribution<- tapply(LatinoDRTVdata$Normalized.Latino.DRTV.call.volume,LatinoDRTVdata$Day.of.the.Week,mean)

Daily.Latino.DRTV.Distribution

#tapply returns a named array, so index it by day name rather than with $
Daily.Latino.DRTV.Distribution["Saturday"]

##check that a new object has been added
ls()

str(Daily.Latino.DRTV.Distribution)

#make sure you install the lattice package for the graphics
#load the lattice package
library(lattice)
barchart(Daily.Latino.DRTV.Distribution)

Here are the first rows of my data:

> head(LatinoDRTVdata)
      Date Day.of.the.Week Normalized.Latino.DRTV.call.volume
1 3/1/2013          Friday                                384
2 3/2/2013        Saturday                                277
3 3/3/2013          Sunday                                178
4 3/4/2013          Monday                                400
5 3/5/2013         Tuesday                                410
6 3/6/2013       Wednesday                                404
> 
Ryan Chase
  • Basically, you want a horizontal bar chart with the days of the week ordered, plus a box-and-whisker plot on top of the bars? Could you please paste some sample data in a data frame called df, or paste the dput of 10 rows of your data? – vagabond Feb 02 '15 at 20:14
  • I added the top 10 rows of my data. Thanks. – Ryan Chase Feb 02 '15 at 20:26

1 Answer

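  1. Set the DotW factor order by redeclaring the factor with its levels in the order you want. (Assigning to levels() directly would rename the existing levels by position rather than reorder them.)

    LatinoDRTVdata$Day.of.the.Week <- factor(LatinoDRTVdata$Day.of.the.Week, levels = c("Saturday", "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday"))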
    
  2. Use the ggplot2 package to draw the bars.

    library(ggplot2)
    ggplot(LatinoDRTVdata, aes(x=Day.of.the.Week, y=Normalized.Latino.DRTV.call.volume)) + geom_bar(stat="summary", fun="mean") + coord_flip()

    Your data has one row per date rather than one summarized value per day, so geom_bar(stat="summary", fun="mean") plots the mean for each day (on ggplot2 versions before 3.3.0 the argument is fun.y). If you plot an already-summarized table instead, use geom_bar(stat='identity'). The coord_flip() call is what makes the bars horizontal; leave it off to get vertical bars.

  3. A slight modification of 2.

    ggplot(LatinoDRTVdata, aes(x=Day.of.the.Week, y=Normalized.Latino.DRTV.call.volume)) + geom_boxplot()
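
For step 1, it may help to see how a factor's level order and its level labels differ. The following is a minimal base-R sketch on a hypothetical six-row data frame named `calls` with a made-up column `volume`, mirroring the rows shown in the question. It is an illustration under those assumptions, not the asker's actual data.

```r
# Hypothetical sample mirroring the first rows shown in the question.
calls <- data.frame(
  Day.of.the.Week = c("Friday", "Saturday", "Sunday",
                      "Monday", "Tuesday", "Wednesday"),
  volume = c(384, 277, 178, 400, 410, 404),
  stringsAsFactors = TRUE
)

day_order <- c("Saturday", "Sunday", "Monday", "Tuesday",
               "Wednesday", "Thursday", "Friday")

# factor(..., levels = ) matches each value to the level with the same label,
# so every row keeps its original day and only the ordering changes.
reordered <- factor(calls$Day.of.the.Week, levels = day_order)

# Assigning to levels() relabels by position instead: the first alphabetical
# level ("Friday") is renamed to the first new label ("Saturday"), and so on.
relabelled <- calls$Day.of.the.Week
levels(relabelled) <- day_order[seq_along(levels(relabelled))]

as.character(reordered[1])   # the first row is still a Friday
as.character(relabelled[1])  # the first row has been mislabelled
```

Because matching is done on the label text, the labels passed to `levels =` have to be spelled exactly as they appear in the data; any value without a matching level becomes `NA`.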
    

A good resource for ggplot: Cookbook for R
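
Since the question started out in lattice, the same two plots can probably be produced there as well. The sketch below assumes a hypothetical data frame `calls` with columns `day` and `volume`; the values are made up for illustration, so substitute LatinoDRTVdata and its real column names.

```r
library(lattice)  # a recommended package shipped with standard R installations

# Hypothetical sample; the volumes are illustrative, not real data.
calls <- data.frame(
  day = factor(
    c("Friday", "Saturday", "Sunday", "Monday", "Tuesday", "Wednesday",
      "Friday", "Saturday", "Sunday", "Monday", "Tuesday", "Wednesday"),
    levels = c("Saturday", "Sunday", "Monday", "Tuesday", "Wednesday", "Friday")
  ),
  volume = c(384, 277, 178, 400, 410, 404, 350, 260, 190, 380, 420, 390)
)

# Mean volume per day, kept as a data frame so the formula interface can be used.
daily_means <- aggregate(volume ~ day, data = calls, FUN = mean)

# Vertical bars: numeric variable on the left of the formula, factor on the right.
bar_plot <- barchart(volume ~ day, data = daily_means,
                     horizontal = FALSE, origin = 0)

# Box-and-whisker plot built from the raw rows; no tapply step is needed.
box_plot <- bwplot(volume ~ day, data = calls, horizontal = FALSE)

print(bar_plot)
print(box_plot)
```

The bars and boxes appear in the order of the factor levels, so reordering the factor first (step 1) carries over to both plots.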

Gary Chung
  • Hey Gary, thanks for your reply! I am still a new R user... can you explain in #1 how R knows to assign "Sat" to the values in the factor currently named "Saturday"? I'm getting an error saying: "Error in levels(LatinoDRTVdata$Day.ofthe.Week) <- c("Sat", "Sun", "Mon", : attempt to set an attribute on NULL". Also, can you tell me how to format your comment replies on StackExchange? – Ryan Chase Feb 02 '15 at 23:53
  • Hi Ryan, I updated the answer to help address your question (before I didn't know how your DotW factor was formatted, so I just guessed). By using the exact factor label "Saturday" as opposed to "Sat", you eliminate the ambiguity. Give it a shot. Make sure it is in fact a factor by using `is.factor(data$DotW)` to test, and if needed, `data$DotW <- as.factor(data$DotW)` to convert. To add code blocks to a comment, surround the code with backticks (the key above Tab). – Gary Chung Feb 03 '15 at 19:43