How can I create a plot with ggplot when my answers are TRUE or FALSE?

This is my code:

t.obese<-master1%>%
  filter(Income>0,obese==TRUE)%>%
  select(Income,obese)

> head(t.obese)
  Income obese
1  21600    TRUE
2   4000    TRUE
3  12720    TRUE
4  26772    TRUE

When I try to create a plot, R tells me: "Don't know how to automatically pick scale for object of type haven_labelled/vctrs_vctr/double. Defaulting to continuous. Error: stat_count() can only have an x or y aesthetic."

Thank you!

> dput(t.obese[1:10, ])
structure(list(Income = structure(c(1944, 4000, 16000, 19200, 
22800, 21600, 18000, 18000, 2000, 18000), label = "Wages,Salary from                    main job", format.stata = "%42.0g", labels = c(`[-5] in Fragebogenversion    nicht enthalten` = -5, 
 `[-2] trifft nicht zu` = -2), class = c("haven_labelled",      "vctrs_vctr", 
 "double")), obese = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, 
TRUE, TRUE, TRUE)), row.names = c(NA, 10L), class = "data.frame")
Gregor Thomas
Lin
  • show your plot code? – cory Dec 02 '20 at 16:22
  • What do you want to put on the x-axis? What do you want to put on the y-axis? What type of plot (bar, line, point...)? Do you want to use color or anything else? – Gregor Thomas Dec 02 '20 at 16:24
  • As a complete beginner, I tried `p<-ggplot(t.obese, aes(x=obese, y=Income)) + geom_bar()`. I would like a boxplot with y = Income and x = obese. – Lin Dec 02 '20 at 16:27
  • But a line, point, or histogram plot would be fine too. My main problem is the issue with TRUE and FALSE. – Lin Dec 02 '20 at 16:30
  • Do you want to show how the income differs across obese and not obese? – Lefkios Paikousis Dec 02 '20 at 16:39
  • @LefkiosPaikousis yes!! – Lin Dec 02 '20 at 16:50
  • Could you share some data with `dput()`? With *"haven_labelled/vctrs_vctr/double"* being referenced, seems like there's a little more going on than just TRUE/FALSE values. Posting the results of `dput(t.obese[1:10, ])` would help a lot. – Gregor Thomas Dec 02 '20 at 16:50
  • Though it may be as simple as a class conversion, you could try `ggplot(t.obese, aes(x=factor(obese), y=Income)) + geom_bar()` – Gregor Thomas Dec 02 '20 at 16:51
  • @GregorThomas `structure(list(Income = structure(c(1944, 4000, 16000, 19200, 22800, 21600, 18000, 18000, 2000, 18000), label = "Wages,Salary from main job", format.stata = "%42.0g", labels = c(`[-5] in Fragebogenversion nicht enthalten` = -5, `[-2] trifft nicht zu` = -2), class = c("haven_labelled", "vctrs_vctr", "double")), obese = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE)), row.names = c(NA, 10L), class = "data.frame") > ` – Lin Dec 02 '20 at 16:53
  • @GregorThomas I tried this, but it still gives me the same error: `ggplot(t.obese, aes(x=factor(obese), y=Income)) + geom_bar()` – Lin Dec 02 '20 at 16:56
  • Could you edit the `dput()` output into your question, in a code block? The comment formatting eats some of the important parts. – Gregor Thomas Dec 02 '20 at 17:03
  • @GregorThomas I edited it! – Lin Dec 02 '20 at 17:13

2 Answers


With the minimal data you shared, I tried this:

library(ggplot2)
#Code1
ggplot(as.data.frame(t.obese), aes(x=factor(obese), y=Income)) +
  geom_bar(stat='identity')+
  xlab('Obese')+
  scale_y_continuous(labels = scales::comma)

Output:

[Image: bar chart of Income by Obese, produced by Code 1]

And this:

#Code 2
ggplot(as.data.frame(t.obese), aes(x=factor(obese), y=Income)) +
  geom_point()+
  geom_jitter()+
  geom_boxplot()+
  xlab('Obese')

Output:

[Image: box plot of Income by Obese with points overlaid, produced by Code 2]

Duck

If you want to compare the Income distribution across obesity status, you need observations with both obese = TRUE and obese = FALSE. Your filter keeps only the obese == TRUE rows, so there is nothing to compare against.
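As a minimal sketch of keeping both groups, one option is to filter on Income only and leave obese untouched. The data frame `master1_toy` below is a made-up stand-in for `master1` (its values are invented for illustration), and it assumes the real data has a logical `obese` column containing both TRUE and FALSE.

```r
library(dplyr)

# Hypothetical stand-in for master1 (invented values, not the asker's data)
master1_toy <- data.frame(
  Income = c(21600, 4000, 0, 12720, 26772, 18000),
  obese  = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)
)

# Filter on Income only, so rows with obese == FALSE are kept as well
both_groups <- master1_toy %>%
  filter(Income > 0) %>%
  select(Income, obese)

# Count the rows in each group to confirm both are present
table(both_groups$obese)
```

With both groups in one data frame, `obese` can be mapped directly to the x-axis or to `fill`.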

I randomly created a non_obese dataset just to make the comparison possible. I also removed the haven_labelled class from Income using haven::zap_labels(), since it was causing issues in the reprex rendering.
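To illustrate what removing the class does, here is a small sketch using a made-up labelled vector that mimics the Income column from the question. The vector `income` and its values are assumptions for illustration only; `labelled()` and `zap_labels()` are from the haven package.

```r
library(haven)

# Made-up labelled vector resembling the Income column in the question
income <- labelled(
  c(1944, 4000, 16000),
  labels = c("[-2] trifft nicht zu" = -2),
  label  = "Wages,Salary from main job"
)

class(income)        # includes "haven_labelled"

# Drop the value labels and the haven_labelled class
income_plain <- zap_labels(income)

class(income_plain)  # no longer includes "haven_labelled"
```

ggplot2 should then treat the column as an ordinary numeric vector, which avoids the "Don't know how to automatically pick scale" message.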

Anyway, I hope the following helps you get started:

library(dplyr)
library(ggplot2)
library(haven)

obese <- 
structure(list(Income = structure(c(1944, 4000, 16000, 19200, 
                                    22800, 21600, 18000, 18000, 2000, 18000), 
                                  label = "Wages,Salary from main job", 
                                  format.stata = "%42.0g", 
                                  labels = c(`[-5] in Fragebogenversion nicht enthalten` = -5,
                                             `[-2] trifft nicht zu` = -2), 
                                  class = c("haven_labelled", "vctrs_vctr","double")), 
               obese = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,TRUE, TRUE, TRUE)), 
          row.names = c(NA, 10L), class = "data.frame"
          )


# remove the haven/labelled class of the income variable
obese <- 
  obese %>% 
  haven::zap_labels() 

# made-up non-obese group: shift each income by its own random draw
non_obese <- 
  obese %>% 
  mutate(
    Income = Income - rnorm(n(), mean = 1000, sd = 50),
    obese  = !obese
  )



full_data <- 
  bind_rows(obese, non_obese)


# Box plot 
full_data %>% 
  ggplot(
    aes(obese, Income)
  )+
  geom_boxplot(width = 0.5)+
  geom_point(position = position_jitter(width  = 0.05))

# Density plot
full_data %>% 
  ggplot(
    aes(Income,fill = obese)
  )+
  geom_density(alpha = 0.5)

Created on 2020-12-03 by the reprex package (v0.3.0)

Lefkios Paikousis