I have a data frame that has percentage values for a number of variables and observations, as follows:
obs <- data.frame(Site = c("A", "B", "C"), X = c(11, 22, 33), Y = c(44, 55, 66), Z = c(77, 88, 99))
I need to prepare this data as an edge list for network analysis, with "Site" as the nodes and the remaining variables as the edges. The result should look like this:
Node1 Node2 Weight Type
A B 33 X
A C 44 X
...
B C 187 Z
That is, "Weight" is the sum of the two sites' values for every possible pair of sites, calculated separately for each column, and the column name ends up in "Type".
I suppose the answer to this has to be using apply on a combn expression, like in "Applying combn() function to data frame", but I haven't quite been able to work it out.
I can do this all by hand, first taking the combinations for "Site":
sites <- combn(obs$Site, 2)
Then the individual columns, like so:
combX <- combn(obs$X, 2, function(x) sum(x))
and binding those datasets together, but this obviously becomes annoying very soon.
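To make the by-hand version concrete, here is a minimal sketch for a single column (X), assuming the obs data frame from the top of the question. The name edgesX is just something I made up for the illustration, and as.character() is only there in case Site is stored as a factor:

```r
# Example data, copied from the top of the question
obs <- data.frame(Site = c("A", "B", "C"),
                  X = c(11, 22, 33),
                  Y = c(44, 55, 66),
                  Z = c(77, 88, 99))

# All unordered pairs of sites: a 2 x 3 matrix, one pair per column
sites <- combn(as.character(obs$Site), 2)

# Pairwise sums for one column (X), in the same pair order as `sites`
combX <- combn(obs$X, 2, sum)

# Bind the pieces into an edge list for this one column
# (edgesX is a hypothetical name used only for this illustration)
edgesX <- data.frame(Node1  = sites[1, ],
                     Node2  = sites[2, ],
                     Weight = combX,
                     Type   = "X")
edgesX
```

For X this gives the pairs A-B, A-C and B-C with weights 33, 44 and 55, which matches the first rows of the result I am after. The same steps would then have to be repeated for Y and Z and the three pieces stacked with rbind().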
I have tried to do all the variable columns in one go like this
b <- apply(obs[, -1], 1, function(x) {
  sum(utils::combn(x, 2))
})
but there is something wrong with that. Can anyone help, please?