this is in R

OK, so I've used Cook's distance to identify the points I would like to remove from a dataset of 506 observations that I have.

i am able to remove ONE point (number 369) as follows:

modelmc1 = lm(housing[-369,14] ~ housing[-369,1] + housing[-369,2] + 
housing[-369,3] + housing[-369,4] + housing[-369,5] + housing[-369,6] + 
housing[-369,7] + housing[-369,8] + housing[-369,9] + housing[-369,10] + 
housing[-369,11] + housing[-369,12] + housing[-369,13])

my question is: how do I remove MULTIPLE points (around 30)?

thanks

germcd
  • Have you stored the points in an object, say `outlier.index`? Then do `housing.wo.outliers <- housing[-outlier.index, ]`, and for `lm` use `lm(var.name ~ ., data = housing.wo.outliers)`, where `var.name` is the unquoted name of the column at index 14. – infominer Apr 29 '15 at 15:58

1 Answer

You can leave out multiple rows of a data frame by indexing with a vector of negative row numbers, built with c().

modelmc1 = lm(housing[c(-361, -367, -369),14] ~ housing[c(-361, -367, -369),1] + ...)
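
Repeating the index vector in every term gets unwieldy with around 30 points, so here is a minimal sketch of one way to do it in a single step. It assumes the row numbers are stored in a vector (called `outlier.index` here, as in the comment above). The data frame is a simulated stand-in for `housing`, with 506 rows and the response in column 14, and the 4/n cutoff on Cook's distance is just a common rule of thumb, not something taken from the question.

```r
# Simulated stand-in for the asker's data (hypothetical values):
# 506 rows, 14 columns named V1..V14, response in column 14.
set.seed(1)
housing <- as.data.frame(matrix(rnorm(506 * 14), nrow = 506))

# Fit on all rows, then flag influential rows by Cook's distance.
# The 4/n threshold is an assumed rule of thumb; use your own criterion.
full_fit <- lm(V14 ~ ., data = housing)
outlier.index <- which(cooks.distance(full_fit) > 4 / nrow(housing))

# Drop all flagged rows at once with a negative index vector.
# Guard the empty case: housing[-integer(0), ] would return zero rows.
housing.wo.outliers <- if (length(outlier.index) > 0) {
  housing[-outlier.index, ]
} else {
  housing
}

# Refit; `V14 ~ .` regresses column 14 on every other column.
modelmc1 <- lm(V14 ~ ., data = housing.wo.outliers)
```

With the real data, replace `V14` with the actual name of the response column. Subsetting the data frame once and using the `data` argument keeps the formula short and guarantees every term uses the same rows.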