I have a data frame:
library(tibble)

df <- tibble(
  id = 1:10,
  family = c("a", "a", "b", "b", "c", "d", "e", "f", "g", "h"),
  col1_a = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  col1_b = c(1, 2, 3, 4, NA, NA, NA, NA, NA, NA),
  col2_a = c(11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
  col2_b = c(11, 12, 13, 14, NA, NA, NA, NA, NA, NA)
)
Families will only contain 2 members at most (so they're either individuals or pairs).
For individuals (families with only one row, i.e. id 5 to 10), I want to randomly pick 50% of the rows and move their values from the columns ending in 'a' to the matching columns ending in 'b' (col1_a to col1_b, col2_a to col2_b).
By the end, the data should look like the following (the exact result depends on which 50% of rows are picked; in this example the rows were picked separately for col1 and col2):
df <- tibble(
  id = 1:10,
  family = c("a", "a", "b", "b", "c", "d", "e", "f", "g", "h"),
  col1_a = c(1, 2, 3, 4, 5, NA, 7, NA, 9, NA),
  col1_b = c(1, 2, 3, 4, NA, 6, NA, 8, NA, 10),
  col2_a = c(11, 12, 13, 14, NA, NA, 17, 18, NA, 20),
  col2_b = c(11, 12, 13, 14, 15, 16, NA, NA, 19, NA)
)
I would like to do this with a combination of group_by() and mutate(), since I am mostly using the tidyverse.
Update: I forgot to mention that values in columns ending in 'a' should be replaced with NA once they have been moved across to 'b'.
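For illustration, here is a rough sketch of one way the requirement could be expressed with group_by() and mutate(). It is only a sketch under some assumptions: pick_half() is a hypothetical helper written for this example (not a dplyr function), the rows are picked separately for col1 and col2 as in the expected output, and set.seed() is there purely to make the run reproducible.

```r
library(dplyr)
library(tibble)

df <- tibble(
  id = 1:10,
  family = c("a", "a", "b", "b", "c", "d", "e", "f", "g", "h"),
  col1_a = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  col1_b = c(1, 2, 3, 4, NA, NA, NA, NA, NA, NA),
  col2_a = c(11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
  col2_b = c(11, 12, 13, 14, NA, NA, NA, NA, NA, NA)
)

# Hypothetical helper: given a logical vector marking eligible rows,
# return a logical vector that is TRUE for a random half of them
# (rounded down). sample.int() avoids the sample() pitfall when only
# one row is eligible.
pick_half <- function(eligible) {
  idx <- which(eligible)
  chosen <- idx[sample.int(length(idx), size = floor(length(idx) / 2))]
  seq_along(eligible) %in% chosen
}

set.seed(1)  # assumption: fixed seed only so the example is reproducible

result <- df %>%
  group_by(family) %>%
  mutate(single = n() == 1) %>%   # TRUE for families with one member
  ungroup() %>%
  mutate(
    move1 = pick_half(single),    # rows whose col1 value is moved
    move2 = pick_half(single),    # rows whose col2 value is moved
    # Fill the 'b' column first, then blank out the 'a' column.
    col1_b = if_else(move1, col1_a, col1_b),
    col1_a = if_else(move1, NA_real_, col1_a),
    col2_b = if_else(move2, col2_a, col2_b),
    col2_a = if_else(move2, NA_real_, col2_a)
  ) %>%
  select(-single, -move1, -move2)

print(result)
```

If the same rows should be moved for every column pair, a single call to pick_half() could be reused for both col1 and col2 instead of drawing twice.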