0

I'm trying to write some more sophisticated code. I have a problem which I have a solution for, but I would like to increase the flexibility of the code.

Input data (d):

ID  Height_cm   Height_m    Weight_kg   Weight_lb
1     180                        70 
2     165                                  120
3                  1.8           80 
4     100                                  60
5     190                                  200
6                  1.7           100

I want to transform height into cm and weight in to kg, all in one column like this:

ID  Height  Weight
1   180       70
2   165       54
3   180       80
4   100       45
5   190       90
6   170      100

I have a solution, but it's hard-coded:

library(tidyverse)
d$Height <- NA
d$Weight <-NA
d$Height <- ifelse(!is.na(d$Height_cm), d$Height_cm, d$Height_m * 0.01)
d$Weight <- ifelse(!is.na(d$Weight_kg), d$Weight_kg, d$Weight_lb * 0.45)

d <- d %>% select(ID, Height, Weight)

I want to be more sophisticated and take in an input file (below) and make the transformation based on that dataframe. The results will be the same, but it uses this transformation df to achieve the same thing:

Transformation_d:

marker      unit        new_col_name    transformation
Height_cm   centimetre  Height          1
Height_m    metre       Height          0.01
Weight_kg   kilogram    Weight          1
Weight_lb   pounds      Weight          0.45

This is where I'm stuck... I'd appreciate some guidance.

Go easy, I'm new to R!

lecb
  • 389
  • 1
  • 9

1 Answers1

0

Height_m should have transformation value as 100, I guess.

You can use cur_column() to get the column name to match with Transformation_d dataframe and get corresponding transformation value to multiply.

library(dplyr)

d %>%
  mutate(across(-1, ~. * Transformation_d$transformation[match(cur_column(), 
                         Transformation_d$marker)])) %>%
  transmute(ID, 
            Height = coalesce(Height_cm, Height_m), 
            Weight = coalesce(Weight_kg, Weight_lb))

#  ID Height Weight
#1  1    180     70
#2  2    165     54
#3  3    180     80
#4  4    100     27
#5  5    190     90
#6  6    170    100

A similar process in base R

cbind(d[1], 
      transform(sweep(d[-1], 2, 
        Transformation_d$transformation[match(names(d)[-1], 
        Transformation_d$marker)], `*`), 
        Height = ifelse(is.na(Height_cm), Height_m, Height_cm), 
        Weight = ifelse(is.na(Weight_kg), Weight_lb, Weight_kg)))
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • Thank you, that’s really helpful. Is it possible to automate the transformed results into a new column of the value in Transformation_d$new_col_name, instead of having to hard code the column names? My dataset has hundreds of columns that need transforming, and automating which new column they go into would be preferable over coding each one individually... hope that makes sense. – lecb Jan 23 '21 at 14:53
  • @lecb This solution might help to do this dynamically https://stackoverflow.com/questions/54971918/dplyr-mutate-at-coalesce-dynamic-names-of-columns/54971985#54971985 – Ronak Shah Jan 24 '21 at 01:33