Given the following data frame:

# Create a data frame with 4 variables and 10 observations
set.seed(1)
df <- data.frame(replicate(4, sample(0:1, 10, replace = TRUE)))

I would like to subtract the columns pairwise, for every combination of two columns, keeping only one difference per pair, i.e. column A - column B but not column B - column A, and so on.

What I have so far is very manual, and it quickly becomes impractical when there are many variables.

# Result
df_result <- as.data.frame(list(df$X1 - df$X2,
                                df$X1 - df$X3,
                                df$X1 - df$X4,
                                df$X2 - df$X3,
                                df$X2 - df$X4,
                                df$X3 - df$X4))

Also, the name of each resulting column should describe the operation, e.g. X1_X2 for X1 - X2.

PeCaDe

1 Answer

You can use combn:

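# Every unordered pair of column names, one pair per column of COMBI
COMBI = combn(colnames(df),2)
# For each pair, subtract the second column from the first
res = data.frame(apply(COMBI,2,function(i)df[,i[1]]-df[,i[2]]))
# Label each result column after its pair, e.g. "X1minusX2"
colnames(res) = apply(COMBI,2,paste0,collapse="minus")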

head(res)
  X1minusX2 X1minusX3 X1minusX4 X2minusX3 X2minusX4 X3minusX4
1         0         0        -1         0        -1        -1
2         1         1         0         0        -1        -1
3         0         0         0         0         0         0
4         0         0        -1         0        -1        -1
5         1         1         1         0         0         0
6        -1         0         0         1         1         0
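
If the underscore naming from the question (X1_X2) is preferred over "minus", the same combn idea can be adapted. This is only a sketch under that assumption, not part of the original answer; the names `pairs` and `res2` are made up for illustration:

```r
# Rebuild the example data from the question
set.seed(1)
df <- data.frame(replicate(4, sample(0:1, 10, replace = TRUE)))

# Each column of `pairs` (hypothetical name) holds one unordered pair of column names
pairs <- combn(colnames(df), 2)

# One difference per pair: first column minus second column
res2 <- as.data.frame(apply(pairs, 2, function(p) df[[p[1]]] - df[[p[2]]]))

# Name each result after its pair, e.g. "X1_X2" for X1 - X2
colnames(res2) <- apply(pairs, 2, paste, collapse = "_")
```

Because combn only returns each pair once, in the order the columns appear, X2 - X1 is never computed, and the code does not change as more columns are added.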
StupidWolf