I have a large data with many groups that looks like this. I want in each group to use the fruit with the most counts as the central fruit, and aggregate the other fruits based on it!
library(tidyverse)
df <- tibble(col1 = c("apple","apple","pple", "banana", "banana","bananna"),
col2 = c("pple","app","app", "bananna", "banan", "banan"),
counts_col1 = c(100,100,2,200,200,2),
counts_col2 = c(2,50,50,2,20,20),
id=c(1,1,1,2,2,2))
df
#> # A tibble: 6 × 5
#> col1 col2 counts_col1 counts_col2 id
#> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 apple pple 100 2 1
#> 2 apple app 100 50 1
#> 3 pple app 2 50 1
#> 4 banana bananna 200 2 2
#> 5 banana banan 200 20 2
#> 6 bananna banan 2 20 2
Created on 2022-03-16 by the reprex package (v2.0.1)
I want my data frame to look like this
id central_fruit fruits counts sum_counts
1 apple apple,pple,app 100,50,2 152
2 banana banana,bananna,banan 200,20,2 222
The format of the output it does not have to be like this. This is just an example. It can be a list of characters or just characters. Any help or guidance is appreciated