I have this data frame, and I'm trying to draw a vertical line on a categorical x-axis.

data <- data.frame(
  condition = c('1', '1', '1', '1', '1', '2', '2', '2', '2', '2', '3', '3', '3', '3', '3'),
  AssessmentGrade = c('400', '410', '420', '430', '440', '500', '510', '520', '530', '540', 
                      '300', '310', '320', '330', '340'), 
  Freq = c('1', '2', '1', '5', '7', '9', '1', '5', '3', '4', '5', '8', '1', '3', '5'), 
  MathGrade = c('A+', 'B-', 'C-', 'D', 'F', 'A-', 'B', 'C+', 'D-', 'F', 'A+', 'D', 'D', 'F', 'C'), 
  Condition = c('Condition 1', 'Condition 1', 'Condition 1', 'Condition 1', 'Condition 1', 
                'Condition 2', 'Condition 2', 'Condition 2', 'Condition 2', 'Condition 2', 
                'Condition 3', 'Condition 3', 'Condition 3', 'Condition 3', 'Condition 3'))

I tried adding a field to make the grade numeric, and that helped:

# factor() first: as.numeric() on a character column would return NA
data$Gradenum <- as.numeric(factor(data$MathGrade))

I used ggplot to get a bubble graph, but I was wondering how I would edit it to use my company's standard colors:

library(ggplot2)

p <- ggplot(data, aes(x = MathGrade, y = AssessmentGrade, size = Freq, fill = Condition)) +
  geom_point(aes(colour = Condition)) +
  ggtitle("Main Title") +
  labs(x = "First Math Grade", y = "Math Assessment Score")

How can I get a vertical line between C+ and D? I see a lot of information out there for the case where the x-axis is a date, but not for categorical values.

Uwe
tangerine7199
  • @Miha IMHO, that's not a good dupe target as the linked question was asking for a vertical line at an x-position where the continuous data has a certain y-value. Here, the OP is asking to draw a vertical line for a categorical variable. – Uwe Jul 28 '17 at 15:09
  • @Walker Is it intended that the grades are ordered A-, A+, B, B-, C, C-, C+, ...? Shouldn't it read A+, A-, B, B-, C+, C, C-, ... instead? – Uwe Jul 28 '17 at 15:35

3 Answers

Hardcoded solutions are error-prone

MrSnake's solution works - but only for the given data set because the value of 7.5 is hardcoded.

It will fail with just a minor change to the data, e.g., by replacing grade "A+" in row 1 of data by an "A".

Using the hardcoded xintercept of 7.5

p + geom_vline(xintercept = 7.5)

draws the line between grades C- and C+ instead of C+ and D.
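The shift can be checked without plotting. Below is a minimal sketch in base R, assuming the x-axis follows the default sorted levels that `factor()` produces for a character column; `pos_of` is a hypothetical helper introduced here for illustration, not part of any package:

```r
# Grades as in the question's sample data
grades_orig <- c("A+", "B-", "C-", "D", "F", "A-", "B", "C+", "D-", "F",
                 "A+", "D", "D", "F", "C")

# Hypothetical helper: position of one grade on a discrete axis whose
# levels are the default sorted levels of factor()
pos_of <- function(x, grade) which(levels(factor(x)) == grade)

# Same data, but with the "A+" in row 1 replaced by "A"
grades_changed <- replace(grades_orig, 1, "A")

pos_orig    <- pos_of(grades_orig, "D")     # 8
pos_changed <- pos_of(grades_changed, "D")  # 9
```

The extra level `A` pushes `D` one slot to the right, so a line fixed at 7.5 no longer sits directly to the left of `D`.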

This can be solved using ordered factors. But first note that the chart contains another flaw: The grades on the x-axis are ordered alphabetically

A, A-, A+, B, B-, C, C-, C+, D, D-, F

where I would have expected

A+, A, A-, B, B-, C+, C, C-, D, D-, F

Fixing the x-axis

This can be fixed by turning MathGrade into an ordered factor with levels in a given order:

grades <- c(as.vector(t(outer(LETTERS[1:4], c("+", "", "-"), paste0))), "F")
grades
#  [1] "A+" "A"  "A-" "B+" "B"  "B-" "C+" "C"  "C-" "D+" "D"  "D-" "F"
data$MathGrade <- ordered(data$MathGrade, levels = grades)

factor() would be sufficient to plot a properly ordered x-axis, but we need an ordered factor for the next step, the correct placement of the vertical line.
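Why the ordered factor matters can be seen in a small base R sketch. The vectors `g` and `f` below are made-up illustrations, not the question's data: comparisons such as `>=` follow the level order for an ordered factor, while a plain factor only yields `NA` with a warning.

```r
grades <- c(as.vector(t(outer(LETTERS[1:4], c("+", "", "-"), paste0))), "F")

# Ordered factor: comparisons follow the level order, not the alphabet,
# so a later level (a worse grade) counts as "greater"
g <- ordered(c("A+", "C-", "D", "F"), levels = grades)
later <- g >= "D+"               # FALSE FALSE TRUE TRUE
worst <- as.character(max(g))    # "F"

# Plain factor with the same levels: the comparison is not meaningful
# and returns NA (the warning is suppressed here)
f <- factor(c("A+", "C-"), levels = grades)
unordered_result <- suppressWarnings(f >= "D+")
```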

Programmatically placing the vertical line

Let's suppose that the vertical line should be drawn between grades C- and D+. However, either or both grades may be missing from the data, and factor levels that do not occur in the data are not plotted. In the sample data set, there are no data with grade D+, so the vertical line should be plotted between grades C- and D.

So, we need to look for the first grade at or after D+ in the level order and the last grade at or before C- that actually occur in the data set:

upper <- as.character(min(data$MathGrade[data$MathGrade >= "D+"]))
lower <- as.character(max(data$MathGrade[data$MathGrade <= "C-"]))

The vertical line is to be plotted between these two grades. Its x position is the mean of their positions among the levels that are actually present in the data:

xintercpt <- mean(which(levels(droplevels(data$MathGrade)) %in% c(lower, upper)))
p + geom_vline(xintercept = xintercpt)
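The same steps can be wrapped in a function so the position is recomputed whenever the data change. This is a sketch in base R under the assumptions above; `vline_pos`, `last_left` and `first_right` are hypothetical names chosen for illustration:

```r
# Hypothetical helper: x position for a vertical line that separates grades
# up to `last_left` from grades starting at `first_right`
vline_pos <- function(x, last_left, first_right) {
  stopifnot(is.ordered(x))
  left  <- x[x <= last_left]
  right <- x[x >= first_right]
  if (length(left) == 0 || length(right) == 0) {
    stop("no data on one side of the line")
  }
  # Only levels that occur in the data get a slot on the axis
  present <- levels(droplevels(x))
  bounds <- c(as.character(max(left)), as.character(min(right)))
  mean(match(bounds, present))
}

grades <- c(as.vector(t(outer(LETTERS[1:4], c("+", "", "-"), paste0))), "F")
math_grade <- ordered(c("A+", "B-", "C-", "D", "F", "A-", "B", "C+", "D-", "F",
                        "A+", "D", "D", "F", "C"), levels = grades)

# Sample data: the line falls between C- and D
x_sample <- vline_pos(math_grade, "C-", "D+")

# Row 1 changed from "A+" to "A": one more level to the left of the line
x_with_a <- vline_pos(replace(math_grade, 1, "A"), "C-", "D+")
```

With the sample data this gives 7.5, and 8.5 once the extra grade `A` is present, so the line stays between C- and D in both cases.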


Uwe
Just add geom_vline ;)

p + geom_vline(xintercept = 7.5)


abichat
  • Can we say "holy over thinking batman" to me? thank you! once i made the numeric i thought i had to specifically say data$Gradenum=7.5 and i didn't know how to call that. so easy! thank you!!!!!!!!! – tangerine7199 Jul 28 '17 at 15:04
To change the colors to fit your company's scheme, you can add something like:

p + scale_color_manual(values = c('Condition 1' = 'grey20', 
                                  'Condition 2' = 'darkred', 
                                  'Condition 3' = 'blue'))
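If the company palette is given as hex codes, it may be convenient to keep it in one named vector and check that every condition has an entry before plotting. A minimal sketch: the hex codes and the names `brand_colours` and `uncovered` are placeholders, not real brand values.

```r
# Placeholder hex codes, not real brand colours
brand_colours <- c("Condition 1" = "#1B3A57",
                   "Condition 2" = "#E07A1F",
                   "Condition 3" = "#6C8EAD")

conditions <- paste("Condition", 1:3)

# Conditions that would be left without a colour (should be empty)
uncovered <- setdiff(conditions, names(brand_colours))

# With ggplot2 loaded and `p` built as in the question:
# p + scale_colour_manual(values = brand_colours)
```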
Deena