I've imported some SPSS data into R. The data is labelled, and I want to plot the label names rather than the values. I've tried the code below, but it plots the values:

library(labelled)
library(tidyverse)

cols <- c("White", "Red", "Black", "Green", "Brown", "Pink", "Orange")
letters <- c("D", "E", "F", "G", "H", "I", "J")

# add labels to the color values
tmp <- diamonds %>% 
 mutate(color = as.character(color)) %>%
 set_value_labels(color = setNames(letters, cols)) 

ggplot(tmp) + geom_bar(aes(x=print_labels(color)))

The graph still shows the color values on the x-axis rather than the labels. How can I plot the value labels instead?

pluke

2 Answers

1

Based on cols and letters, I would build a lookup data frame and then merge it with tmp (via dplyr::inner_join) to create a new column holding the names that you want along the x-axis.

library(ggplot2)
library(dplyr)
library(tibble)

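# Build a lookup table from the value labels: the names of the labels vector
# are the color names, and its values are the codes stored in tmp$color.
col_key <- attributes(tmp$color)$labels %>%
  as.data.frame() %>%
  dplyr::rename(color = 1) %>%          # codes ("D", "E", ...), used as the join key
  tibble::rownames_to_column("cols")    # label names ("White", "Red", ...)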

tmp <- dplyr::inner_join(tmp,
                         col_key, by = "color")

ggplot(tmp) +
  geom_bar(aes(x = cols))
AndrewGB
  • My problem is that I'm using an SPSS data file with the labelling and values built in – pluke Jul 01 '21 at 17:25
  • Why not manipulate it once it is into R? You can set up a workflow for how you want the figure to look in R that could seamlessly work whenever you bring in an updated file from SPSS to R. – AndrewGB Jul 01 '21 at 17:50
  • I might have to do that, I just thought there must be a more straightforward method. I'm not sure why you have labels when you can't seem to access them – pluke Jul 01 '21 at 18:45
  • The reason I was able to plot them is that I didn't use the `haven` labelling. I just did an `inner_join` with the key data (i.e., cols and letters), then created a new column, then used that to plot. – AndrewGB Jul 01 '21 at 19:44
  • I just added a workaround using the same method, but extracting the names from the haven labelling. This way, when you read in your data from SPSS, you can easily get the key and the color names. – AndrewGB Jul 01 '21 at 20:02
  • I think I've got it: haven has the generic as_factor(x) function, which can be used on both haven-labelled and non-labelled data. – pluke Jul 02 '21 at 12:31
  • I also made a mistake with the set_value_labels(color = setNames(letters, cols)) – pluke Jul 02 '21 at 12:32
1
library(haven)
library(labelled)   # set_value_labels()
library(tidyverse)  # diamonds, %>%, mutate(), ggplot()

cols <- c("White", "Red", "Black", "Green", "Brown", "Pink", "Orange")
letters <- c("D", "E", "F", "G", "H", "I", "J")

tmp <- diamonds %>% 
  mutate(color = as.character(color)) %>%
  set_value_labels(color = setNames(letters, cols)) 

ggplot(tmp) + geom_bar(aes(x=as_factor(color)))

Using the haven package, we can call as_factor(x), which converts a labelled vector into a factor whose levels are the value labels, so ggplot shows the label names instead of the underlying values.
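
To see what as_factor() does outside of ggplot, here is a minimal sketch on a small made-up labelled vector. The names x, f, f_values, df and df_factor are illustrative only and are not part of the question's data.

```r
library(haven)
library(tibble)

# Toy labelled vector (made up for illustration): the stored values are the
# codes "D"/"E"/"F", and the names of `labels` are the text to display.
x <- labelled(
  c("D", "E", "D", "F"),
  labels = c(White = "D", Red = "E", Black = "F")
)

# Default behaviour: each value is replaced by its label.
f <- as_factor(x)
print(as.character(f))

# levels = "values" keeps the raw codes as the factor levels instead.
f_values <- as_factor(x, levels = "values")
print(as.character(f_values))

# as_factor() also has a data frame method; by default it converts only the
# labelled columns and leaves the other columns untouched.
df <- tibble(color = x, n = 1:4)
df_factor <- as_factor(df)
str(df_factor)
```

Converting the whole data frame once after import may be more convenient than wrapping each aesthetic in as_factor(), since every later plot then sees ordinary factors.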

pluke