How to retrieve label and value information for data in dataframe A from dataframe B

Question

Edited to be more precise: I have a CSV file that stores the label and value information for each variable in dataframe A.
The original dataframe A has 250 variables. For demonstration purposes I have:

Reduced dataframe A from 250 to 7 variables.
Loaded the CSV file with the information as dataframe B and also reduced it to 7 variables.

My specific question: How can I assign the information (e.g. label and value) in dataframe B programmatically to the variables in dataframe A? So far I can only achieve my goal by doing this variable by variable.

I hope the question is now more specific. I have no idea whether my way of thinking is completely wrong. I would be grateful for any help.

my dataframe A:

dfA <- structure(list(a = c(2L, 7L, 8L, 5L, 10L, 1L, 6L, 9L, 3L, 4L),
                      b = c("29.06.2016", "18.07.2016", "26.07.2016", "04.08.2016", "12.08.2016", "12.08.2016", "24.08.2016", "26.08.2016", "27.08.2016", "27.08.2016"),
                      c = c("A", "A", "B", "A", "C", "C", "B", "C", "B", "C"),
                      d = c(4795L, 7242L, 2246L, 7914L, 9910L, 4279L, 9174L, 8329L, 8310L, 4799L),
                      e = c(6L, 10L, 8L, 10L, 11L, 7L, 11L, 2L, 12L, 4L),
                      f = c(1973L, 1933L, 1977L, 1969L, 1960L, 1950L, 1963L, 1967L, 1951L, 1970L),
                      g = c(2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L)), row.names = c(NA, -10L), class = "data.frame")

my dataframe B (info for labels and values):

dfB <- structure(list(
  variable = c("a", "b", "c", "d", "e", "f", "g"), 
  class = c("number", "string", "string", "string", "number", "number", "number"), 
  label = c("AAAA", "BBBB", "CCCC", "DDDD", "EEEE", "FFFF", "GGGG"), 
  values = c("", 
             "", 
             "", 
             "", 
             "@0@,@k.A.@,@1@,@Januar@,@2@,@Februar@,@3@,@März@,@4@,@April@,@5@,@Mai@,@6@,@Juni@,@7@,@Juli@,@8@,@August@,@9@,@September@,@10@,@Oktober@,@11@,@November@,@12@,@Dezember@", 
             "", 
             "@0@,@k.A.@,@1@,@female@,@2@,@male@")), 
  row.names = c(NA, -7L), class = "data.frame")

desired output:

nyk · Accepted Answer · 2021-01-24T09:54:49.320


You can try the following. It would be easier, though, to read in e and g as factors beforehand; then you would not need to convert them afterwards.

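library(lubridate)

# inspect the labels, then rename the columns of dfA with them
dfB$label
dfC <- setNames(dfA, dfB$label)

# use an arbitrary date to generate the twelve month names
a <- month(ymd(210101) + months(0:11), label = TRUE)

# map the codes 1..12 to the month names explicitly; relying on the
# levels found in the data would shift the labels, because not every
# month occurs in column e
dfC$EEEE <- factor(dfC$EEEE, levels = 1:12, labels = levels(a))

# codes 1 and 2 stand for female and male
dfC$GGGG <- factor(dfC$GGGG, levels = 1:2, labels = c("female", "male"))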
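
For many variables, the same idea can be driven by dataframe B instead of being written out per column. The following is only a sketch under some assumptions: the values column always alternates code and text in the form @code@,@text@, an empty string means "no value labels", and the variable label is stored in a "label" attribute (a convention that packages such as haven and labelled also use). parse_values is a hypothetical helper written for this example, and the two data frames are shortened copies of the ones in the question.

```r
# Shortened copies of the question's data frames (columns e, f, g only)
dfA <- data.frame(
  e = c(6L, 10L, 2L, 12L),
  f = c(1973L, 1933L, 1977L, 1969L),
  g = c(2L, 2L, 1L, 2L)
)
dfB <- data.frame(
  variable = c("e", "f", "g"),
  label    = c("EEEE", "FFFF", "GGGG"),
  values   = c(
    "@0@,@k.A.@,@1@,@Januar@,@2@,@Februar@,@3@,@März@,@4@,@April@,@5@,@Mai@,@6@,@Juni@,@7@,@Juli@,@8@,@August@,@9@,@September@,@10@,@Oktober@,@11@,@November@,@12@,@Dezember@",
    "",
    "@0@,@k.A.@,@1@,@female@,@2@,@male@"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split "@0@,@k.A.@,@1@,@female@" into codes and texts.
# Assumes the tokens strictly alternate code, text, code, text, ...
parse_values <- function(x) {
  x <- sub("@$", "", sub("^@", "", x))          # drop the outer @ signs
  tokens <- strsplit(x, "@,@", fixed = TRUE)[[1]]
  list(code  = tokens[c(TRUE, FALSE)],          # 1st, 3rd, 5th, ... token
       label = tokens[c(FALSE, TRUE)])          # 2nd, 4th, 6th, ... token
}

dfC <- dfA
for (i in seq_len(nrow(dfB))) {
  v <- dfB$variable[i]
  if (!v %in% names(dfC)) next                  # skip variables not in dfA

  # value labels: turn the column into a factor using the lookup
  if (nzchar(dfB$values[i])) {
    lk <- parse_values(dfB$values[i])
    dfC[[v]] <- factor(dfC[[v]], levels = lk$code, labels = lk$label)
  }

  # variable label: keep it as an attribute on the column
  attr(dfC[[v]], "label") <- dfB$label[i]
}

str(dfC)
```

Because the codes are matched explicitly through levels, months or categories that do not occur in the data cannot shift the labels. If the labels should become the column names instead, as in the code above, setNames(dfC, dfB$label[match(names(dfC), dfB$variable)]) would be one way to do that at the end.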

edited Jan 24 '21 at 09:54

answered Jan 24 '21 at 09:43

nyk


Thank you. This solution works, and I was aware of this kind of approach; nevertheless, thank you for your effort. The original data has over 250 variables, so I was trying to do this automatically for each variable, with some kind of lookup table. Upvoted, since it is a working solution. – TarJae Jan 24 '21 at 10:17
