I am trying to define a new variable that is the division of two other variables.

df$NewVariable <- df$OldVariable1 / (df$OldVariable2/100)

OldVariable2 contains NAs. I'd like NewVariable to be NA whenever OldVariable2 is NA. However, when I run summary(df$NewVariable) after creating it, the mean and maximum are both Inf. How can I tell R to produce NAs, so that NewVariable isn't affected by any Infs?

JotHa
  • Looks like at least one value in `df$OldVariable2` is `0`. – GKi Mar 09 '21 at 14:35
  • The default is that `NA`s are propagated by division as you are asking for. `Inf` values are created when you divide by 0 -- they don't have anything to do with `NA` values. – Gregor Thomas Mar 09 '21 at 14:39
  • As mentioned, one of the values in df$OldVariable2 is 0, in that case, what value do you want? NA or Inf? – Manu Mar 09 '21 at 14:45
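To illustrate the point made in the comments, here is a minimal base R sketch of how division treats missing values and zeros. The vectors `num` and `den` are made up for this example and are not from the original question.

```r
# Hypothetical numerator and denominator, chosen to cover each case
num <- c(2, 3, 4, 0)
den <- c(3, NA, 0, 0)

res <- num / den

# NA in the denominator propagates to NA; a zero denominator gives
# Inf (or NaN when the numerator is also zero)
print(res)
print(is.na(res))        # TRUE for both NA and NaN
print(is.infinite(res))  # TRUE only for Inf / -Inf

# Positions of zero denominators; which() skips the NA comparison
print(which(den == 0))
```

Running `which(df$OldVariable2 == 0)` on the real data frame is one way to check whether zeros in the denominator are the source of the `Inf` values.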

1 Answer

Try this:

df <- data.frame(OldVariable1 = c(2,3,4,4),
                 OldVariable2 = c(3,NA,0,1.5))
df$NewVariable <- ifelse(df$OldVariable2 == 0, NA, 
                         df$OldVariable1 / (df$OldVariable2/100))

Result:

> df
  OldVariable1 OldVariable2 NewVariable
1            2          3.0    66.66667
2            3           NA          NA
3            4          0.0          NA
4            4          1.5   266.66667

Thanks to @Gregor Thomas, who pointed out that dividing a number by NA (or NA/NA) already gives NA, so the extra condition I originally had for missing values wasn't necessary and the code could be simplified.
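As a possible alternative (not part of the original answer), the division can be done first and any non-finite results replaced afterwards. This is only a sketch, using the same example data as above, and it assumes you want every `Inf`, `-Inf` and `NaN` in the result turned into `NA`, whatever its cause.

```r
df <- data.frame(OldVariable1 = c(2, 3, 4, 4),
                 OldVariable2 = c(3, NA, 0, 1.5))

# Plain division: NA propagates, division by zero gives Inf
df$NewVariable <- df$OldVariable1 / (df$OldVariable2 / 100)

# is.finite() is FALSE for NA, NaN, Inf and -Inf, so this sets
# every non-finite result to NA
df$NewVariable[!is.finite(df$NewVariable)] <- NA

print(summary(df$NewVariable))
```

Unlike the `ifelse()` version, this would also blank out an `Inf` that comes from the numerator, so it is only appropriate if that is acceptable for your data.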

Gregor Thomas
Manu
  • This won't solve the problem - division already propagates `NA` values. If OP is getting `Inf` values the problem is 0s in the denominator, not missing values. – Gregor Thomas Mar 09 '21 at 14:38
  • Thanks for pointing out the flaw of my answer @GregorThomas – Manu Mar 09 '21 at 14:43
  • Much improved! The `is.na(df$OldVariable2)` is redundant - `x / NA` will give `NA` for all `x` already. You can simplify to `ifelse(df$OldVariable2 == 0, NA, df$OldVariable1 / (df$OldVariable2/100))` – Gregor Thomas Mar 09 '21 at 14:46
  • Thank you @GregorThomas, I started typing in the console `3/NA` and `NA/NA` to see what I get... it's `NA`! – Manu Mar 09 '21 at 14:56