
I'm trying to apply a function over the rows of a data frame and return a value that depends on the value in one column of each row. I'd prefer to pass the whole data frame rather than name each variable, because the actual code has many variables; this is a simple example.

I've tried purrr map_dbl and rowwise but can't get either to work. Any suggestions please?

#sample df
df <- data.frame(Y=c("A","B","B","A","B"),
                  X=c(1,5,8,23,31))

#required result
Res <- data.frame(Y=c("A","B","B","A","B"),
                  X=c(1,5,8,23,31),
                  NewVal=c(10,500,800,230,3100)
                  )

#use mutate and map or rowwise etc
Res <- df %>%
  mutate(NewVal=map_dbl(.x=.,.f=FnAdd(.)))

Res <- df %>%
  rowwise() %>% 
  mutate(NewVal=FnAdd(.))


#sample fn
FnAdd <- function(Data){

  if(Data$Y=="A"){
    X=Data$X*10
  }  

  if(Data$Y=="B"){
    X=Data$X*100
  } 
  return(X)
}
Zeus
  • I know, but this is just a simple example, there are about 20 similar functions each taking many variables. I'm looking for a clean way to do this. – Zeus Sep 26 '17 at 06:27
  • @Zeus Could you please provide an example that reproduces your problem – akrun Sep 26 '17 at 06:32
  • if you run either of the methods I tried you should get the same error as me – Zeus Sep 26 '17 at 06:35
  • @Zeus If you have 100 unique elements and the values to be multiplied are also kind of custom, then it is better to create the keyval dataset manually. The `ifelse` route may not work as there is a limitation for the number of nested ifelse and would be slow – akrun Sep 26 '17 at 06:47
  • my mistake, 100 rows to process, 2 unique values to decide on which calculation type to use. ``ifelse`` is good and simple – Zeus Sep 26 '17 at 06:55

2 Answers


If there are multiple values, it is better to create a key/value dataset, join it to the data, and then do the multiplication:

library(dplyr)
keyVal <- data.frame(Y = c("A", "B"), NewVal = c(10, 100))
df %>%
   left_join(keyVal, by = "Y") %>%   # attach each row's multiplier by key
   mutate(NewVal = X * NewVal)
#  Y  X NewVal
#1 A  1     10
#2 B  5    500
#3 B  8    800
#4 A 23    230
#5 B 31   3100

It is not clear how many unique values there are in the 'Y' column of the actual dataset. If there are only a few, case_when can be used:

FnAdd <- function(Data){
   Data %>%
      mutate(NewVal = case_when(Y == "A" ~ X * 10,
                                Y == "B" ~ X *100,
                                TRUE ~ X)) 
}

FnAdd(df)
#   Y  X NewVal
#1 A  1     10
#2 B  5    500
#3 B  8    800
#4 A 23    230
#5 B 31   3100
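
For the pattern raised in the comments (pass the whole data frame, reference its columns inside the function, and use the result inside mutate), one option is to have the function return a vector with one value per row instead of a modified data frame. This is only a sketch: it assumes the sample df from the question and the magrittr pipe, where `.` stands for the piped data frame, and FnAddVec is a hypothetical name used for illustration.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(Y = c("A", "B", "B", "A", "B"),
                 X = c(1, 5, 8, 23, 31))

# Hypothetical helper: takes the whole data frame and returns
# one numeric value per row (vectorised, so no row loop is needed)
FnAddVec <- function(Data) {
  case_when(Data$Y == "A" ~ Data$X * 10,
            Data$Y == "B" ~ Data$X * 100,
            TRUE ~ Data$X)
}

# `.` is the data frame on the left-hand side of the pipe
Res <- df %>%
  mutate(NewVal = FnAddVec(.))
```

Because the helper is vectorised, it evaluates each condition on the whole column at once, which avoids the problem of calling if() on a vector of length greater than one.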
akrun
  • This doesn't replicate ``Req`` in the question. – orizon Sep 26 '17 at 06:26
  • @orizon Can u check now? – akrun Sep 26 '17 at 06:29
  • Is there no way to build the if..else logic in the function? A different result/calculation depending on the value of a row element in one of the columns – Zeus Sep 26 '17 at 06:32
  • @Zeus Sure, but could you tell us how many unique elements are there in the 'Y' column? – akrun Sep 26 '17 at 06:34
  • there will be 2 unique elements. This is just a simple example to show the logic I need. I'd like to pass the whole df and reference the variables in the function, and change the function result depending on one of the elements – Zeus Sep 26 '17 at 06:36
  • @Zeus In that case, it is better to create a key/val dataset with unique keys and values corresponding to that – akrun Sep 26 '17 at 06:38
  • is there anyway to do something like this: ``Res <- df %>% mutate(NewVal=FnAdd(.)) #sample fn FnAdd <- function(Data){ NewVal = case_when(Data$Y == "A" ~ Data$X * 10, Data$Y == "B" ~ Data$X *100, TRUE ~ X) }`` – Zeus Sep 26 '17 at 06:47
  • I get an error ``Error in mutate_impl(.data, dots) : object 'Y' not found`` when I run your solution – Zeus Sep 26 '17 at 06:51
  • @Zeus I am using `dplyr_0.7.2`. Which version you have? – akrun Sep 26 '17 at 06:56

You were originally looking for a solution using dplyr's rowwise() function, so here is that solution. The nice thing about this approach is that you don't need to create a separate function.

Here's the version using ifelse():

df %>% 
   rowwise() %>% 
   mutate(NewVal = ifelse(Y == "A", X * 10,
                          ifelse(Y == "B", X * 100, NA_real_)))

and here's the version using case_when:

df %>% 
   rowwise() %>% 
   mutate(NewVal = case_when(Y == "A" ~ X * 10,
                             Y == "B" ~ X * 100))
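
If a separate function with plain if/else logic is still wanted, as in the question, a possible sketch uses purrr's pmap_dbl, which calls the function once per row and passes that row's columns as named arguments. It assumes the sample df from the question; FnAddRow is a hypothetical name, and the NA_real_ fallback for other values of Y is an assumption, not something the question specifies.

```r
library(dplyr)
library(purrr)

# Sample data from the question
df <- data.frame(Y = c("A", "B", "B", "A", "B"),
                 X = c(1, 5, 8, 23, 31),
                 stringsAsFactors = FALSE)

# Hypothetical per-row helper: receives one row's columns as named
# arguments; `...` absorbs any columns the function does not use
FnAddRow <- function(Y, X, ...) {
  if (Y == "A") {
    X * 10
  } else if (Y == "B") {
    X * 100
  } else {
    NA_real_  # assumed fallback for unexpected values of Y
  }
}

# pmap_dbl iterates over the rows of the piped data frame (`.`)
Res <- df %>%
  mutate(NewVal = pmap_dbl(., FnAddRow))
```

With many columns, the `...` argument means only the variables a given function actually uses need to be named in its signature.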
hackR