I am learning R and working on the 'Human Activity Recognition' dataset. It has 563 variables; the last one, 'Activity', is the class variable to be predicted.

I am trying to use the KNN algorithm from R's caret package.

I created another dataset, human2, with the 561 numeric variables, excluding the last two (subject and activity).

I ran PCA on it and decided to use the top 20 PCs.

pca1 <- prcomp(human2, scale = TRUE)

I saved the scores for those PCs in another dataset called 'newdat':

newdat <- pca1$x[ ,1:20]

Now I am trying to run the code below, but it gives me an error because newdat doesn't have my class variable:

trctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
set.seed(3333)
knn_fit <- train(Activity ~., data = newdat, method = "knn",
                 trControl=trctrl,
                 preProcess = c("center", "scale"),
                 tuneLength = 10)

I tried to extract the last column, 'Activity', from the raw data and append it to 'newdat' with cbind() so that I could use it in knn_fit (above), but it is not getting appended.

Any suggestions on how to use the PCs?


Below is the code:

human1 <- read.csv("C:/NIIT/Term 2/Prog for Analytics II/human-activity-recognition-with-smartphones (1)/train1.csv", header = TRUE)
humant <- read.csv("C:/NIIT/Term 2/Prog for Analytics II/human-activity-recognition-with-smartphones (1)/test1.csv", header = TRUE)

#taking the predictor columns
human2 <- human1[ ,1:561]


pca1 <- prcomp(human2, scale = TRUE)
newdat <- pca1$x[ ,1:15]
newdat <- cbind(newdat, Activity = as.character(human1$Activity))

pca1 <- preProcess(human1[,1:561], 
                   method=c("BoxCox", "center", 
                            "scale", "pca"))
PC = predict(pca1, human1[,1:561])


trctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
set.seed(3333)
knn_fit <- train(Activity ~., data = newdat, method = "knn",
                 trControl=trctrl,
                 preProcess = c("center", "scale"),
                 tuneLength = 10)

#applying knn_fit to test data

test_pred <- predict(knn_fit, newdata = testing)
test_pred

#checking the prediction
confusionMatrix(test_pred, testing$V1 )

I am running into an error in the part below. Here is the call together with the error:

> knn_fit <- train(Activity ~., data = newdat, method = "knn",
+                  trControl=trctrl,
+                  preProcess = c("center", "scale"),
+                  tuneLength = 10)
Error: cannot allocate vector of size 1.3 Gb
Flexo
  • Hi! I think you clicked the wrong [edit] link. You tried to edit an answer to your question instead of editing your question. See the edit review here https://stackoverflow.com/review/suggested-edits/18005793 – Андрей Беньковский Nov 20 '17 at 15:02

1 Answer

How did you try to cbind the column? Could you please show the code? I think you simply ran into the difficulties caused by stringsAsFactors = TRUE. Does the following line solve your problem?

#...
#newdat <- pca1$x[ ,1:20]    
newdat <- cbind(newdat, Activity = as.character(human1$Activity))
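
One thing worth checking with the cbind() approach: binding a numeric matrix to a character vector coerces the whole result to a character matrix, so the PC scores stop being numeric. That is a likely (though unconfirmed) contributor to the memory error reported in the comments, since train() may then treat every column as categorical. Below is a minimal sketch of the difference, using the built-in iris data as a stand-in for the HAR data; the names scores, as_matrix and as_df are illustrative only.

```r
# Stand-in data: iris replaces the HAR data, which is not available here.
pca <- prcomp(iris[, 1:4], scale. = TRUE)
scores <- pca$x[, 1:2]            # numeric matrix of PC scores

# cbind() of a numeric matrix and a character vector coerces
# everything to character, including the PC scores.
as_matrix <- cbind(scores, Activity = as.character(iris$Species))
print(is.character(as_matrix))

# data.frame() keeps each column's own type: the scores stay numeric
# and the class label is stored as a factor.
as_df <- data.frame(scores, Activity = factor(iris$Species))
print(sapply(as_df, class))
```

Under that assumption, a data frame built this way (for example data.frame(pca1$x[, 1:20], Activity = factor(human1$Activity))) would be what gets passed as data to train().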
Manuel Bickel
  • Thanks Manuel, I did as you suggested. The code below accepts the dataset now, but I am getting an error. Code: `trctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3); set.seed(3333); knn_fit <- train(Activity ~., data = newdat, method = "knn", trControl=trctrl, preProcess = c("center", "scale"), tuneLength = 10)` Output: `Error: cannot allocate vector of size 412.4 Mb` – Aravindh Rajan Nov 20 '17 at 12:48
  • Could you please edit your question and post the full output of your error. Furthermore, please also provide some more information on the data you are using via `dput(human2)` or if this is too long `dput(head(human2))`. – Manuel Bickel Nov 20 '17 at 12:56
  • I have edited my initial post as I do not have enough characters here. I have given the full code I am using along with the error. – Aravindh Rajan Nov 20 '17 at 13:51
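
The code in the question predicts on an object called testing that is never created. One possible way to build it, sketched here under the assumption that the test set should be projected onto the PCs fitted on the training set, is predict() on the prcomp object. The built-in iris data stands in for the HAR files, and the names idx, train_pc and test_pc are hypothetical.

```r
# Stand-in split: 100 training rows and 50 test rows from iris.
set.seed(1)
idx <- sample(nrow(iris), 100)
train_x <- iris[idx, 1:4]
test_x  <- iris[-idx, 1:4]

# Fit the PCA on the training predictors only.
pca <- prcomp(train_x, scale. = TRUE)

# Training scores plus the class label, kept in a data frame.
train_pc <- data.frame(pca$x[, 1:2], Activity = iris$Species[idx])

# Project the test predictors with the training centering, scaling
# and rotation, then attach the test labels.
test_pc <- data.frame(predict(pca, newdata = test_x)[, 1:2],
                      Activity = iris$Species[-idx])

str(test_pc)
```

With the real data, test_pc would play the role of testing in predict(knn_fit, newdata = ...), and its Activity column would be the reference passed to confusionMatrix().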