I am pretty new to R. I've been trying (for the entire day) to plot data points on one graph and generate a legend for it.

I have a raw data set containing people's political ideology (factor, 1=liberal, 2=conservative, 3=neutral), their firm choice (liberal or conservative), and the difference in log wages offered by the two firms (liberal firm wage - conservative firm wage).

What I tried to do is

  • Divide the people-tasks into quartiles according to the difference in log offered wages.
  • Divide each quartile by people's political ideology (3 types). This gave me 4 X 3 groups.
  • Plot 12 data points in one graph.
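
The grouping steps above could be sketched roughly as follows in base R. This is only an illustration under assumptions: the raw data is simulated, the column names `ideology`, `log_wage_diff`, and `chose_liberal` are hypothetical, and taking the quartile mean as the x value and the share choosing the liberal firm as the y value is a guess at how the 12 points were computed.

```r
# Hypothetical raw data: one row per person-task (all names are assumptions).
set.seed(1)
n <- 600
raw <- data.frame(
  ideology      = factor(sample(c("Liberal", "Conservative", "Neutral"), n, replace = TRUE)),
  log_wage_diff = rnorm(n, mean = 0, sd = 0.35),
  chose_liberal = rbinom(n, size = 1, prob = 0.5)
)

# Step 1: split the log wage difference into quartiles.
cuts <- quantile(raw$log_wage_diff, probs = seq(0, 1, by = 0.25))
raw$quartile <- cut(raw$log_wage_diff, breaks = cuts,
                    include.lowest = TRUE, labels = paste0("Q", 1:4))

# Assumed x value: the mean log wage difference within each quartile.
raw$x <- ave(raw$log_wage_diff, raw$quartile)

# Steps 2 and 3: one row per quartile-ideology cell (4 x 3 = 12 rows),
# with y = share of tasks in that cell where the liberal firm was chosen.
points <- aggregate(chose_liberal ~ ideology + quartile + x, data = raw, FUN = mean)
names(points)[names(points) == "chose_liberal"] <- "y"
points
```

A data frame shaped like `points` (one row per point, with the group in its own column) is the form ggplot2 works with most naturally.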

I wanted to

  • plot data points in one graph,
  • generate a legend, and
  • connect dots based on the political ideology, respectively.

So, I computed 12 data points (4 for liberal, 4 for neutral, 4 for conservative) and plotted them using ggplot2. I used blue for liberal, green for neutral, and red for conservative.

myggplot <- ggplot(, aes(x=c(-0.6, 0.55), y=c(0,1))) +
  geom_point(aes(x=-0.4384035, y=0.3995726),col = "blue",shape = 15, size = 3) +
  annotate("point", x=-0.221052, y=0.4463519, col="blue", shape=15, size=3)+
  annotate("point", x=0.0839785, y=0.9610656, col="blue", shape=15, size=3)+
  annotate("point", x=0.4146425, y=0.9598309, col="blue", shape=15, size=3)+
  annotate("point", x=-0.4384035, y=0.1650485, col="green", shape=17, size=3)+
  annotate("point", x=-0.221052, y=0.25, col="green", shape=17, size=3)+
  annotate("point", x=0.0839785, y=0.8275862, col="green", shape=17, size=3)+
  annotate("point", x=0.4146425, y=0.8152174, col="green", shape=17, size=3)+
  annotate("point", x=-0.4384035, y=0.06818182, col="red", shape=16, size=3)+
  annotate("point", x=-0.221052, y=0.08527132, col="red", shape=16, size=3)+
  annotate("point", x=0.0839785, y=0.6377953, col="red", shape=16, size=3)+
  annotate("point", x=0.4146425, y=0.7080292, col="red", shape=16, size=3)+
  scale_color_manual(name="Political ideology",
                     values=c("Liberal"="blue", "Neutral"="green", "Conservative"="red"),
                     labels=c("Liberal", "Neutral", "Conservative"),
                     guide="legend")+
  scale_shape_identity() +
  labs(y="Probability of Choosing a Liberal Firm", x="Log Wage (Liberal Firm - Conservative Firm)",
       title="The Effects of Log Wage Difference on Firm Choice") +
  theme(
    plot.title = element_text(size=10, hjust=0.5, face="bold"),
    axis.title.x = element_text(size=10),
    axis.title.y = element_text(size=10)
  )

However, I cannot generate a legend, even after trying several approaches. This is what I can see:

ggplot result

I also tried to draw three lines, each connecting the 4 dots of the same color, using the following code, but it failed.

myggplot + geom_line(mapping=aes(colour="blue", "green", "red", size=1))

It gives me an error message:

Error: Discrete value supplied to continuous scale
  • The crux of the issue is that your data should be in a data frame. (Mis)using `annotate()` this way means that there is no way for ggplot to determine the relationships between your data points. Also, data added by `annotate()`, by design, will not appear in any legends. – Ritchie Sacramento Mar 15 '22 at 23:49
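
As a side note on the error message above, a small sketch of what the `geom_line()` call was probably doing: `aes()` takes `x` and `y` as its first two arguments, so the unnamed strings `"green"` and `"red"` are matched to the axes by position rather than treated as colours. This only inspects the mapping and assumes ggplot2 is installed.

```r
library(ggplot2)

# Unnamed arguments to aes() are matched by position: first x, then y.
m <- aes(colour = "blue", "green", "red", size = 1)

# The mapping contains x and y entries, holding "green" and "red".
names(m)
```

Because the plot's x and y scales are already continuous, mapping strings to them is the likely source of "Discrete value supplied to continuous scale".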

2 Answers

You can do something like this:


library(ggplot2)

# Map colour and shape to ideology so that ggplot2 builds the legend itself
ggplot(df, aes(x,y, color=ideology, shape=ideology)) + 
  geom_point(size=3) + 
  labs(y="Probability of Choosing a Liberal Firm", x="Log Wage (Liberal Firm - Conservative Firm)",
       title="The Effects of Log Wage Difference on Firm Choice")+
  theme(
    plot.title = element_text(size=10, hjust=0.5, face="bold"),
    axis.title.x = element_text(size=10),
    axis.title.y = element_text(size=10),
    legend.position="bottom"
  ) + 
  # One line per ideology, since colour and shape define the groups
  geom_line()+
  # breaks sets the legend order; labels default to the break values
  scale_color_manual(name="Political ideology",
                     values=c("Liberal"="blue", "Neutral"="green", "Conservative"="red"),
                     breaks=c("Liberal", "Neutral", "Conservative"),
                     guide="legend")+ 
  scale_shape_manual(name="Political ideology",
                     values=c("Liberal"=15, "Neutral"=17, "Conservative"=16),
                     breaks=c("Liberal", "Neutral", "Conservative"),
                     guide="legend")

Note that this requires your data to be in a data frame named df, which should look like this:

# A tibble: 12 x 3
         x      y ideology    
     <dbl>  <dbl> <chr>       
 1 -0.438  0.400  Liberal     
 2 -0.221  0.446  Liberal     
 3  0.0840 0.961  Liberal     
 4  0.415  0.960  Liberal     
 5 -0.438  0.165  Neutral     
 6 -0.221  0.25   Neutral     
 7  0.0840 0.828  Neutral     
 8  0.415  0.815  Neutral     
 9 -0.438  0.0682 Conservative
10 -0.221  0.0853 Conservative
11  0.0840 0.638  Conservative
12  0.415  0.708  Conservative
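
If it helps, a table like the one above can also be typed in by hand. The following is a minimal sketch using the 12 values from the question; storing `ideology` as a factor is my own assumption, chosen because ggplot2 orders discrete legends by factor level rather than alphabetically.

```r
# Group names in the order they should appear in the legend.
ideologies <- c("Liberal", "Neutral", "Conservative")

df <- data.frame(
  # The same four x positions (one per quartile) repeat for each ideology.
  x = rep(c(-0.4384035, -0.221052, 0.0839785, 0.4146425), times = 3),
  y = c(0.3995726, 0.4463519, 0.9610656, 0.9598309,    # Liberal
        0.1650485, 0.25, 0.8275862, 0.8152174,         # Neutral
        0.06818182, 0.08527132, 0.6377953, 0.7080292), # Conservative
  ideology = factor(rep(ideologies, each = 4), levels = ideologies)
)

str(df)
```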

Input:

df <- structure(list(x = c(-0.4384035, -0.221052, 0.0839785, 0.4146425, 
-0.4384035, -0.221052, 0.0839785, 0.4146425, -0.4384035, -0.221052, 
0.0839785, 0.4146425), y = c(0.3995726, 0.4463519, 0.9610656, 
0.9598309, 0.1650485, 0.25, 0.8275862, 0.8152174, 0.06818182, 
0.08527132, 0.6377953, 0.7080292), ideology = c("Liberal", "Liberal", 
"Liberal", "Liberal", "Neutral", "Neutral", "Neutral", "Neutral", 
"Conservative", "Conservative", "Conservative", "Conservative"
)), row.names = c(NA, -12L), class = c("tbl_df", "tbl", "data.frame"))

langtang

It can be done by modifying your code slightly, but as has already been mentioned it's not the best way.

legend_colors <- c("Liberal"="blue", "Neutral"="green", "Conservative"="red")
legend_shapes <- c("Liberal"=15, "Neutral"=17, "Conservative"=16)

myggplot <- ggplot(, aes(x=c(-0.6, 0.55), y=c(0,1))) +
  geom_point(aes(x=-0.4384035, y=0.3995726, col= "Liberal", shape="Liberal"), size = 3) +
  geom_point(aes(x=-0.221052, y=0.4463519, col="Liberal", shape="Liberal"), size=3) +
  geom_point(aes(x=0.0839785, y=0.9610656, col="Liberal", shape="Liberal"), size=3) +
  geom_point(aes(x=0.4146425, y=0.9598309, col="Liberal", shape="Liberal"), size=3) +
  geom_line(aes(x=c(-0.4384035,-0.221052,0.0839785,0.4146425),
                y=c(0.3995726,0.4463519,0.9610656,0.9598309),
                col = "Liberal")) +
  geom_point(aes(x=-0.4384035, y=0.1650485, col="Neutral", shape="Neutral"), size=3) +
  geom_point(aes(x=-0.221052, y=0.25, col="Neutral", shape="Neutral"), size=3) +
  geom_point(aes(x=0.0839785, y=0.8275862, col="Neutral", shape="Neutral"), size=3) +
  geom_point(aes(x=0.4146425, y=0.8152174, col="Neutral", shape="Neutral"), size=3) +
  geom_line(aes(x=c(-0.4384035,-0.221052,0.0839785,0.4146425),
                y=c(0.1650485,0.25,0.8275862,0.8152174),
                col = "Neutral")) +
  geom_point(aes(x=-0.4384035, y=0.06818182, col="Conservative", shape="Conservative"), size=3) +
  geom_point(aes(x=-0.221052, y=0.08527132, col="Conservative", shape="Conservative"), size=3) +
  geom_point(aes(x=0.0839785, y=0.6377953, col="Conservative", shape="Conservative"), size=3) +
  geom_point(aes(x=0.4146425, y=0.7080292, col="Conservative", shape="Conservative"), size=3) +
  geom_line(aes(x=c(-0.4384035,-0.221052,0.0839785,0.4146425),
                y=c(0.06818182,0.08527132,0.6377953,0.7080292),
                col = "Conservative")) +
  scale_color_manual(name="Political ideology",
                     values= legend_colors,
                     guide="legend") +
  scale_shape_manual(name="Political ideology",
                     values= legend_shapes,
                     guide="legend") +
  labs(y="Probability of Choosing a Liberal Firm", x="Log Wage (Liberal Firm - Conservative Firm)",
       title="The Effects of Log Wage Difference on Firm Choice") +
  theme(plot.title = element_text(size=10, hjust=0.5, face="bold"),
        axis.title.x = element_text(size=10),
        axis.title.y = element_text(size=10))


myggplot

85sph
  • Hello! I followed what you suggested. It worked very well! I learned how to plot data points without generating a data frame -- which is also very helpful! Thank you! – Youjeong Song Mar 16 '22 at 01:55