
Assume I have a data frame with three columns:

set.seed(24)
df1 <- data.frame(a=runif(10),b=runif(10),c=runif(10))

I want to build one with six columns containing all pairwise products, including the squares:

a*a, a*b, a*c, b*c, b*b, c*c

The solution I'm looking for should work for any number of columns, not just three.

akrun
sheß

3 Answers


Let df be your data frame, then try this:

formula <- ~ I(a^2) + I(b^2) + I(c^2) + a:b + a:c + b:c - 1
X <- model.matrix(formula, df)

Use -1 to drop the intercept, i.e., the all-ones column. Use I() to protect a^2, so that ^ is evaluated as arithmetic rather than as a formula operator.

Higher-order terms such as three-way interactions are no obstacle; model.matrix() handles them just as easily.

For your example data frame, you get something like:

> X
       I(a^2)      I(b^2)    I(c^2)        a:b        a:c        b:c
1  0.02830988 0.290128663 0.8060044 0.09062841 0.15105592 0.48357521
2  0.78597627 0.451852115 0.1003373 0.59594047 0.28082514 0.21292636
3  0.36190629 0.117679147 0.5325122 0.20637060 0.43899829 0.25033093
4  0.83645938 0.006638227 0.9812959 0.07451582 0.90598796 0.08070976
5  0.50038157 0.197485843 0.6194279 0.31435374 0.55673179 0.34975454
6  0.25813071 0.567147970 0.5028665 0.38262032 0.36028502 0.53404096
7  0.51074360 0.219564943 0.1966824 0.33487518 0.31694526 0.20780897
8  0.37611759 0.752857721 0.3169607 0.53213065 0.34527451 0.48849390
9  0.00562814 0.627098114 0.8408894 0.05940872 0.06879421 0.72616812
10 0.78306385 0.405336110 0.3063323 0.56338624 0.48977313 0.35237413
attr(,"assign")
[1] 1 2 3 4 5 6

I did not set seed, so the numbers may be different when you test.

model.matrix() is designed for constructing the design matrix in regression analysis. In your case you have only numerical data, but it can also build factor-numeric and factor-factor interactions.
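To avoid editing the formula by hand when the number of columns changes, the formula can be assembled from the column names. The following is only a sketch, assuming the data frame has at least two columns and that all column names are syntactically valid R names; the helper variables vars, squares, pairs and f are illustrative names, not part of the original answer.

```r
set.seed(24)
df1 <- data.frame(a = runif(10), b = runif(10), c = runif(10))

vars <- names(df1)

# Squared terms, protected with I() so ^ is treated as arithmetic
squares <- paste0("I(", vars, "^2)")

# All two-way interaction labels, e.g. "a:b", "a:c", "b:c"
pairs <- combn(vars, 2, FUN = paste, collapse = ":")

# Assemble a one-sided formula without an intercept
f <- as.formula(paste("~", paste(c(squares, pairs), collapse = " + "), "- 1"))

X <- model.matrix(f, df1)
colnames(X)
```

Because the term labels are generated from names(df1), the same code should carry over to a data frame with four or more numeric columns without changes.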

Zheyuan Li
  • Thanks; however, this still requires manual fiddling if you suddenly have a third column – sheß May 25 '16 at 10:05
  • Sorry, I meant fourth. My point is that I don't want to have to adapt my code but might be dealing with data frames of different sizes – sheß May 25 '16 at 10:19
  • `cbind(setNames(as.data.frame(df1^2), paste0(names(df1), "^2")), model.matrix(~ (. + .)^2 - . - 1, df1))` – Roland May 25 '16 at 10:21

Here is another option with combn: take the column names two at a time, multiply the corresponding columns, and cbind the result with the square of the original dataset. Note that the squared columns keep their original names (a, b, c) in the output.

res <- cbind(df1^2, do.call(cbind,combn(colnames(df1), 2, 
               FUN= function(x) list(df1[x[1]]*df1[x[2]]))))
colnames(res)[-(seq_len(ncol(df1)))] <-  combn(colnames(df1), 2, 
                 FUN = paste, collapse=":")
res
#            a           b           c        a:b        a:c         b:c
#1  0.08559952 0.365890531 0.008823729 0.17697473 0.02748285 0.056820059
#2  0.05057603 0.137444401 0.304984209 0.08337501 0.12419698 0.204739766
#3  0.49592997 0.451167798 0.525871254 0.47301970 0.51068123 0.487089495
#4  0.26925425 0.452905189 0.019023202 0.34920860 0.07156869 0.092820832
#5  0.43906475 0.102675746 0.049713853 0.21232357 0.14774167 0.071445132
#6  0.84721676 0.817486693 0.472890881 0.83221898 0.63296215 0.621757189
#7  0.07825199 0.039249934 0.005850588 0.05542008 0.02139673 0.015153719
#8  0.58342170 0.001953909 0.359676293 0.03376319 0.45808619 0.026509902
#9  0.64261164 0.250923183 0.397086073 0.40155468 0.50514566 0.315655035
#10 0.06488487 0.019260683 0.002174826 0.03535148 0.01187911 0.006472142
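In the output above the squared columns are still labelled a, b and c, which is easy to misread. A possible variant, sketched here under the assumption that all columns are numeric and that there are at least two columns and two rows, wraps the same combn idea in a function and labels the squares explicitly. The function name pairwise_products is hypothetical, and it returns a matrix rather than a data frame.

```r
set.seed(24)
df1 <- data.frame(a = runif(10), b = runif(10), c = runif(10))

# Hypothetical helper: squares plus all pairwise products of the columns
pairwise_products <- function(df) {
  m <- as.matrix(df)

  # Squared columns, labelled "a^2", "b^2", ...
  sq <- m^2
  colnames(sq) <- paste0(colnames(m), "^2")

  # combn() applies FUN to each pair of names; each call returns one
  # product column, and the results are collected into a matrix
  cross <- combn(colnames(m), 2, FUN = function(x) m[, x[1]] * m[, x[2]])
  colnames(cross) <- combn(colnames(m), 2, FUN = paste, collapse = ":")

  cbind(sq, cross)
}

res <- pairwise_products(df1)
colnames(res)
```

Wrap the result in as.data.frame() if a data frame is needed downstream.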
akrun

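Here is my solution; it is concise and works for any number of columns. The first choose(n, 2) columns of the result are the cross products and the last n columns are the squares: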

n=ncol(df1)
combb=combn(n,2)
combb=cbind(combb, sapply(1:n, function(i) rep(i,2)))
res=apply(df1, 1, function(x) { apply(combb, 2, function(y) prod(x[y])) })
t(res)

          # [,1]       [,2]        [,3]       [,4]        [,5]        [,6]
 # [1,] 0.17697473 0.02748285 0.056820059 0.08559952 0.365890531 0.008823729
 # [2,] 0.08337501 0.12419698 0.204739766 0.05057603 0.137444401 0.304984209
 # [3,] 0.47301970 0.51068123 0.487089495 0.49592997 0.451167798 0.525871254
 # [4,] 0.34920860 0.07156869 0.092820832 0.26925425 0.452905189 0.019023202
 # [5,] 0.21232357 0.14774167 0.071445132 0.43906475 0.102675746 0.049713853
 # [6,] 0.83221898 0.63296215 0.621757189 0.84721676 0.817486693 0.472890881
 # [7,] 0.05542008 0.02139673 0.015153719 0.07825199 0.039249934 0.005850588
 # [8,] 0.03376319 0.45808619 0.026509902 0.58342170 0.001953909 0.359676293
 # [9,] 0.40155468 0.50514566 0.315655035 0.64261164 0.250923183 0.397086073
# [10,] 0.03535148 0.01187911 0.006472142 0.06488487 0.019260683 0.002174826
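The same index matrix can also be used without the nested apply calls, by selecting whole columns at once and multiplying them elementwise. This is a sketch of that alternative, not the code from the answer above; it assumes numeric columns only, and the names m, idx and the "*" column labels are chosen here for illustration.

```r
set.seed(24)
df1 <- data.frame(a = runif(10), b = runif(10), c = runif(10))

m <- as.matrix(df1)
n <- ncol(m)

# Column index pairs: all 2-combinations first, then (i, i) for the squares
idx <- cbind(combn(n, 2), rbind(seq_len(n), seq_len(n)))

# Pick the left and right factor of every pair and multiply elementwise
res <- m[, idx[1, ], drop = FALSE] * m[, idx[2, ], drop = FALSE]

# Label each column by the two factors it was built from
colnames(res) <- paste(colnames(m)[idx[1, ]], colnames(m)[idx[2, ]], sep = "*")
colnames(res)
```

The column order matches the apply-based result (cross products, then squares), with the added benefit that each column carries a name.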
989