With an SVM, we can map the data to a higher-dimensional space in order to deal with non-linearity.

I want to do that. I have 19 features, and for every pair of features x_i and x_j I need to compute:

     sqrt(2)*x_i*x_j

and also the square of each feature:

       (x_i)^2

So the new features will be:

    (x_1)^2, (x_2)^2,...,(x_19)^2, sqrt(2)*x_1*x_2, sqrt(2)*x_1*x_3,... 

Finally, I want to remove any columns whose values are all zero.

Example:

        col1    col2     col3    
          1      2        6

New data frame:

        col1      col2     col3    col4              col5           col6     
        (1)^2    (2)^2    (6)^2    sqrt(2)*(1)*(2)   sqrt(2)*(1)*(6)   sqrt(2)*(2)*(6)
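
For reference, the mapping described above can be written compactly. This is only a sketch of the standard identity, with n denoting the number of features (19 here) and z a second data point introduced purely for illustration. The sqrt(2) factor on the cross terms is what makes the dot product of two mapped points equal the squared dot product of the original points:

```latex
\phi(x) = \bigl(x_1^2, \dots, x_n^2,\ \sqrt{2}\,x_1 x_2,\ \sqrt{2}\,x_1 x_3,\ \dots,\ \sqrt{2}\,x_{n-1} x_n\bigr)

\phi(x) \cdot \phi(z)
  = \sum_{i=1}^{n} x_i^2 z_i^2 + 2 \sum_{i<j} x_i x_j z_i z_j
  = \Bigl(\sum_{i=1}^{n} x_i z_i\Bigr)^2
```

With n = 19 this gives 19 squared terms plus 19*18/2 = 171 cross terms, so 190 new columns before any all-zero columns are removed.
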
  • @MrFlick I made a small example, I hope it is enough –  Sep 11 '19 at 15:05
  • This is actually a statistics question, so the right platform is [CrossValidated](https://stats.stackexchange.com/). Generally speaking, though, there are numerous [methods](https://data-flair.training/blogs/svm-kernel-functions/) for increasing the dimension. Maybe you want to select one and try to implement it in R by writing your own function. If you get stuck at some point, it is better to ask here at that moment, with information such as the desired function, your effort so far, and the error you faced. – maydin Sep 11 '19 at 15:35

1 Answer

I use the data.table package for this kind of operation. You will also need gtools to generate the combinations of features.

    library(data.table)
    library(gtools)

    # input data frame
    df <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9)

    # convert to a data.table so columns can be added by reference
    dt <- as.data.table(df)
    # specify the feature variables
    features <- c("x1", "x2", "x3")

    # squared columns: x_i^2
    dt[, (paste0(features, "_", "squared")) := lapply(.SD, function(x) x^2),
       .SDcols = features]

    # cross-term columns: sqrt(2) * x_i * x_j, one row of all_combs per unordered pair
    all_combs <- as.data.table(gtools::combinations(n = length(features), r = 2, v = features))
    for (i in seq_len(nrow(all_combs))) {
      set(dt,
          j = paste0(all_combs[i, V1], "_", all_combs[i, V2]),
          value = sqrt(2) * dt[, get(all_combs[i, V1]) * get(all_combs[i, V2])])
    }

    # convert back to a data frame (still contains the original x1, x2, x3 columns)
    df2 <- as.data.frame(dt)
    df2
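
The question also asks for all-zero columns to be dropped, which the code above does not cover. As a rough base-R sketch of the whole transformation, including that last step: `quad_features` is a hypothetical helper name, not part of any package, and it assumes all selected columns are numeric and that there are at least two of them. It returns only the new columns, not the original ones.

```r
# Hypothetical helper (not from any package): squared terms, sqrt(2)-scaled
# cross terms, then removal of columns that are entirely zero.
# Assumes numeric columns and at least two features.
quad_features <- function(df, features = names(df)) {
  x <- as.matrix(df[, features, drop = FALSE])

  # squared terms: x_i^2
  squares <- x^2
  colnames(squares) <- paste0(features, "_squared")

  # cross terms: sqrt(2) * x_i * x_j for every unordered pair i < j
  pairs <- combn(features, 2)
  cross <- sqrt(2) * x[, pairs[1, ], drop = FALSE] * x[, pairs[2, ], drop = FALSE]
  colnames(cross) <- paste(pairs[1, ], pairs[2, ], sep = "_")

  out <- cbind(squares, cross)

  # keep only columns with at least one non-zero value
  keep <- colSums(out != 0, na.rm = TRUE) > 0
  as.data.frame(out[, keep, drop = FALSE])
}

# the single-row example from the question
df_example <- data.frame(col1 = 1, col2 = 2, col3 = 6)
quad_features(df_example)
```

Working on a matrix keeps the pairwise products vectorised, which should matter little for 19 features but avoids a loop over the 171 pairs. If the original columns are needed alongside the new ones, `cbind(df, quad_features(df))` would put them back.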
Jonny Phelps