1

I have the following dataframe. And I need to replace unique values above 0 with numbers starting with 1 within each group? I understand how to replace values within a column or within a row but not within a whole group.

df <- data.frame(
  group = c("A","A","B","B","C"),
  value1 = c(55, 0, 92, 93, 173),
  value2 = c(55, 0, 92, 93, 174),
  value3 = c(66, 77, 92, 0, 175)
)

print(df)

  group value1 value2 value3
1     A     55     55     66
2     A      0      0     77
3     B     92     92     92
4     B     93     93      0
5     C    173    174    175

And here is what I need to get:

  group value1 value2 value3
1     A      1      1      2
2     A      0      0      3
3     B      1      1      1
4     B      2      2      0
5     C      1      2      3

I tried to do it using mutate, across and dense_rank functons.

df <- df %>%
   group_by(group) %>%
   mutate(across(-group, ~ifelse(. > 0, dense_rank(.), .))) %>%
   ungroup()

But got an error:

Error: Problem with `mutate()` input `..1`.
x Can't subset columns that don't exist.
x Column `group` doesn't exist.
i Input `..1` is `across(-group, ~ifelse(. > 0, dense_rank(.), .))`.
i The error occurred in group 1: group = "A".

2 Answers2

0

As @GuedesBF mentioned, dense_rank() with ifelse() would make the variable without 1.
So I figured this out with pivot_longer() and pivot_wider().

First, pivoting your df with pivot_longer() as follwing.

df_long <- df %>%
  pivot_longer(cols = -group)

# A tibble: 15 × 3
   group name   value
   <chr> <chr>  <dbl>
 1 A     value1    55
 2 A     value2    55
 3 A     value3    66
 4 A     value1     0
 5 A     value2     0
 6 A     value3    77
 7 B     value1    92
 8 B     value2    92
 9 B     value3    92
10 B     value1    93
11 B     value2    93
12 B     value3     0
13 C     value1   173
14 C     value2   174
15 C     value3   175

Next, split the data whether value is 0. When value is equal to 0, do dense_rank(), others (value is 0).

df_long_rank <- df_long %>%
    filter(value > 0) %>% 
    group_by(group) %>%
    mutate(value = dense_rank(value))

df_long_zero <- df_long %>%
    filter(value==0)

After that, binding the rank data and zero data.

df_long_merge <- bind_rows(df_long_rank, df_long_zero) %>%
  arrange(group) %>%
  ungroup()

Finally, after arrange the data, do pivot_wider() and unnest().

df_long_merge %>%
  pivot_wider(values_fn = list) %>%
  unnest(cols=value1:value3) 

output

# A tibble: 5 × 4
  group value1 value2 value3
  <chr>  <dbl>  <dbl>  <dbl>
1 A          1      1      2
2 A          0      0      3
3 B          1      1      1
4 B          2      2      0
5 C          1      2      3

All codes

df_long <- df %>% pivot_longer(cols = -group)

df_long_rank <- df_long %>%
  filter(value > 0) %>% 
  group_by(group) %>%
  mutate(value = dense_rank(value))

df_long_zero <- df_long %>%
  filter(value==0)

bind_rows(df_long_rank, df_long_zero) %>%
  arrange(group) %>%
  ungroup() %>%
  pivot_wider(values_fn = list) %>%
  unnest(cols=value1:value3)
YH Jang
  • 1,306
  • 5
  • 15
0

A pure-dplyr solution:

library(dplyr)

df %>%
  mutate({
    dat <- as.data.frame(pick(value1:value3))
    dat[dat != 0] <- dense_rank(dat[dat != 0])
    dat
  }, .by = group)

#   group value1 value2 value3
# 1     A      1      1      2
# 2     A      0      0      3
# 3     B      1      1      1
# 4     B      2      2      0
# 5     C      1      2      3
Darren Tsai
  • 32,117
  • 5
  • 21
  • 51