Create variable of unique combinations based on condition in R

Question

In the following data frame:

structure(list(model = c("A1", "A1", "B4", "B4", "B4", "A4", 
"A4", "A4", "G4", "G4"), category = c("X", "Y", "X", "Y", "Z", 
"X", "Y", "Z", "X", "Z"), sale = c(194L, 0L, 59L, 29L, 0L, 176L, 
88L, 0L, 87L, 44L)), class = "data.frame", row.names = c(NA, 
-10L))


   model category sale
1     A1        X  194
2     A1        Y    0
3     B4        X   59
4     B4        Y   29
5     B4        Z    0
6     A4        X  176
7     A4        Y   88
8     A4        Z    0
9     G4        X   87
10    G4        Z   44

The category variable takes the unique values X, Y, or Z. I need to create all possible combinations of the model and category variables. Some of them already exist, but others, for example A1 - Z, are missing. Therefore, I need to complete the table with the missing combinations.

The sale column needs to follow these rules:

If a combination with Z is missing (e.g. A1-Z), sale is the same as model-Y (so A1-Y)
If a combination with Y is missing (e.g. A1-Y), sale is the same as model-X (so A1-X)

Expected output:

   model category sale
     A1        X  194
     A1        Y    0
     A1        Z    0
     B4        X   59
     B4        Y   29
     B4        Z    0
     A4        X  176
     A4        Y   88
     A4        Z    0
     G4        X   87
     G4        Z   44
     G4        Y   87

Merge with expand.grid(,) done on unique values. Pretty sure that worked examples are already available on SO. — IRTFM, Dec 10 '22 at 08:18
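A minimal base R sketch of the approach suggested in the comment above, assuming the data frame from the question is stored as `df`. The names `grid`, `full`, and `fallback` are made up for illustration, and the fill loop is just one way to apply the two rules from the question:

```r
df <- data.frame(
  model    = c("A1", "A1", "B4", "B4", "B4", "A4", "A4", "A4", "G4", "G4"),
  category = c("X", "Y", "X", "Y", "Z", "X", "Y", "Z", "X", "Z"),
  sale     = c(194L, 0L, 59L, 29L, 0L, 176L, 88L, 0L, 87L, 44L)
)

# Every model/category pair built from the unique values.
grid <- expand.grid(
  model    = unique(df$model),
  category = unique(df$category),
  stringsAsFactors = FALSE
)

# Left join: pairs that are absent from df get sale = NA.
full <- merge(grid, df, by = c("model", "category"), all.x = TRUE)

# Rules from the question: a missing Y borrows from X, a missing Z from Y.
# Y is handled first so that a missing Z can use a Y that was just filled.
fallback <- c(Y = "X", Z = "Y")
for (cat in names(fallback)) {
  src <- fallback[[cat]]
  for (i in which(full$category == cat & is.na(full$sale))) {
    full$sale[i] <- full$sale[full$model == full$model[i] & full$category == src]
  }
}

print(full)
```

Because `merge()` sorts by the key columns by default, the rows come out ordered by model and category rather than in the original order.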

Chamkrai · Answer 1 · 2022-12-10T08:38:36.667

0

Kinda quick and dirty way to do it.

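library(dplyr)
library(tidyr)

df %>% 
  complete(model, category) %>%                         # add missing model/category rows (sale = NA)
  mutate(sale = if_else(is.na(sale), lag(sale), sale))  # fill each NA from the row above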

# A tibble: 12 × 3
   model category  sale
   <chr> <chr>    <int>
 1 A1    X          194
 2 A1    Y            0
 3 A1    Z            0
 4 A4    X          176
 5 A4    Y           88
 6 A4    Z            0
 7 B4    X           59
 8 B4    Y           29
 9 B4    Z            0
10 G4    X           87
11 G4    Y           87
12 G4    Z           44
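To see why `lag(sale)` fills the gaps here, it may help to look at a single model in isolation. This is a small sketch, with the vector typed in by hand to mimic the A1 rows after `complete()`:

```r
library(dplyr)

# A1 after complete(): X = 194, Y = 0, and the newly added Z row is NA.
sale <- c(194L, 0L, NA)

# lag() shifts the vector down by one position, so each element
# lines up with the value from the row above it.
shifted <- lag(sale)

# Keep existing values; replace an NA with the value from the row above.
filled <- if_else(is.na(sale), shifted, sale)

print(filled)
```

Note that `lag()` in the answer is not grouped by model and only looks one row up, so the result depends on `complete()` returning the rows sorted by model and then category, and on no model missing two adjacent categories.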

edited Dec 10 '22 at 08:38

answered Dec 10 '22 at 08:33

Chamkrai


Could you explain this line: mutate(sale = if_else(is.na(sale), lag(sale), sale))? In particular, why do we use lag(sale)? – Mark Noble Dec 10 '22 at 08:48
`complete()` generates NA for the rows it adds. To fill them according to your condition: if sale is NA, take the value from the row above, so that Z takes Y's value when Z is NA. – Chamkrai Dec 10 '22 at 09:11
But what if there are arbitrary characters instead of X, Y, Z? – Mark Noble Dec 10 '22 at 09:20
@MarkNoble It probably wouldn't work then; I just solved the question in a lazy way. – Chamkrai Dec 10 '22 at 09:27
Sorry, but I can't accept this answer. I need more general code. – Mark Noble Dec 10 '22 at 09:28
Lol, it's fine :) Good luck – Chamkrai Dec 10 '22 at 09:31
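One possible way to make the answer less dependent on the labels being X, Y, Z is to state the fallback order explicitly and fill within each model. The following is only a sketch: `fallback_order` is a hypothetical name, and it assumes the rule generalizes to "a missing category takes the value of the nearest earlier category in that order", which the question states only for Y and Z.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  model    = c("A1", "A1", "B4", "B4", "B4", "A4", "A4", "A4", "G4", "G4"),
  category = c("X", "Y", "X", "Y", "Z", "X", "Y", "Z", "X", "Z"),
  sale     = c(194L, 0L, 59L, 29L, 0L, 176L, 88L, 0L, 87L, 44L)
)

# Assumed fallback order: replace with whatever labels the data uses.
fallback_order <- c("X", "Y", "Z")

result <- df %>%
  # A factor makes complete() expand to every level and fixes the sort order.
  mutate(category = factor(category, levels = fallback_order)) %>%
  complete(model, category) %>%
  group_by(model) %>%
  arrange(category, .by_group = TRUE) %>%
  # Within each model, carry the last known sale down into the NA rows.
  fill(sale, .direction = "down") %>%
  ungroup()

print(result)
```

Grouping by model keeps a value from one model from leaking into the next, and `fill()` also covers the case where both Y and Z are missing for the same model. A model whose first category is missing would keep NA in that row, since there is nothing earlier to borrow from.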
