
I have a new question related to my earlier topic, "deleting outlier in r with account of nominal var". In the new case, the variables x and x1 have different lengths:

x <- c(-10, 1:6, 50)
x1<- c(-20, 1:5, 60)
z<- c(1,2,3,4,5,6,7,8)

bx <- boxplot(x)
bx$out

bx1 <- boxplot(x1)
bx1$out


x<- x[!(x %in% bx$out)]
x1 <- x1[!(x1 %in% bx1$out)]


x_to_remove<-which(x %in% bx$out)
x <- x[!(x %in% bx$out)]

x1_to_remove<-which(x1 %in% bx1$out)
x1 <- x1[!(x1 %in% bx1$out)]

z<-z[-unique(c(x_to_remove,x1_to_remove))]
z  

data.frame(cbind(x,x1,z))

Then I get the warning:

Warning message:
In cbind(x, x1, z) :
  number of rows of result is not a multiple of vector length (arg 2)

So in the new data frame the observations of z do not correspond to x and x1. How can I solve this problem? The solution in "Rsolnp: In cbind(temp, funv) : number of rows of result is not a multiple of vector length (arg 1)" does not help me, or am I just doing something wrong?

Edit

x_to_remove<-which(x %in% bx$out)
x <- x[!(x %in% bx$out)]

x1_to_remove<-which(x1 %in% bx1$out)
x1 <- x1[!(x1 %in% bx1$out)]

z<-z[-unique(c(x_to_remove,x1_to_remove))]
z  

d=data.frame(cbind(x,x1,z))
d

This is wrong:

Warning message:
In cbind(x, x1, z) :
  number of rows of result is not a multiple of vector length (arg 2)

d

  x x1 z
1 1  1 2
2 2  2 3
3 3  3 4
4 4  4 5
5 5  5 6
6 6  1 2

How can I get this output for these 3 columns?

NA  NA  NA
1   1   2
2   2   3
3   3   4
4   4   5
5   5   6
NA  NA  NA
NA  NA  NA

The sixth row of d is superfluous.

San.O
  • Well, you can't column-bind 3 vectors of different lengths. I don't understand what your goal is. – cirofdo May 29 '18 at 12:31

1 Answer

The different lengths of the original x, x1 and z vectors are the first problem: how can you tell which z value is related to each x and x1 value?

x <- c(-10, 1:6, 50)
x1<- c(-20, 1:5, 60)
z<- c(1,2,3,4,5,6,7,8)
length(x)
[1] 8
length(x1)
[1] 7
length(z)
[1] 8
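
To see why unequal lengths matter, here is a minimal sketch with two hypothetical vectors a and b (not from the question). cbind() recycles the shorter vector to fill the longest one and warns when the lengths are not multiples of each other, so the last row pairs values that never belonged together:

```r
# Hypothetical vectors for illustration: lengths 6 and 5
a <- 1:6
b <- 1:5

# cbind() recycles b to length 6; suppressWarnings() only hides the
# "not a multiple of vector length" warning so the result can be inspected
m <- suppressWarnings(cbind(a, b))

nrow(m)   # number of rows equals the longest input
m[6, ]    # last row: a = 6 paired with the recycled b = 1
```

This is the same mechanism that produces the superfluous sixth row in the question's data frame.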

Another problem is here:

x<- x[!(x %in% bx$out)] #remove this
x1 <- x1[!(x1 %in% bx1$out)] #remove this


x_to_remove<-which(x %in% bx$out)
x <- x[!(x %in% bx$out)]

x1_to_remove<-which(x1 %in% bx1$out)
x1 <- x1[!(x1 %in% bx1$out)]

You clean x and x1 before calculating x_to_remove and x1_to_remove, so which() no longer finds any outliers in the already cleaned vectors.
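
The effect of the ordering can be checked with a short sketch. It uses boxplot.stats() from grDevices, which applies the same rule as boxplot() without drawing a plot. The names out, x_clean, idx_wrong and idx_right are only illustrative:

```r
x <- c(-10, 1:6, 50)
z <- c(1, 2, 3, 4, 5, 6, 7, 8)

# Outliers according to the boxplot rule (no plot is drawn)
out <- grDevices::boxplot.stats(x)$out

# Wrong order: clean first, then look for outlier positions
x_clean   <- x[!(x %in% out)]
idx_wrong <- which(x_clean %in% out)   # empty: the outliers are already gone

# Right order: find positions in the original vector first
idx_right <- which(x %in% out)         # positions of -10 and 50

# Negative indexing with an empty index vector selects nothing at all
length(z[-idx_wrong])
length(z[-idx_right])
```

Note the side effect of the wrong order: z[-integer(0)] returns an empty vector rather than z unchanged, so z would lose all of its values.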

EDIT: To achieve your desired output, try this code (added code lines are marked in comments):

x <- c(-10, 1:6, 50)
x1<- c(-20, 1:5, 60)
z<- c(1,2,3,4,5,6,7,8)

length_max<-min(length(x),length(x1),length(z)) #Added: shortest original length before outlier removal (min, despite the name); sets the final row count

bx <- boxplot(x)
bx1 <- boxplot(x1)

x_to_remove<-which(x %in% bx$out)
x <- x[!(x %in% bx$out)]

x1_to_remove<-which(x1 %in% bx1$out)
x1 <- x1[!(x1 %in% bx1$out)]

z<-z[-unique(c(x_to_remove,x1_to_remove))]

length_min<-min(length(x),length(x1),length(z)) #Minimum length after outlier remove

d=data.frame(cbind(x[1:length_min],x1[1:length_min],z[1:length_min])) #Bind columns
colnames(d)<-c("x","x1","z")

d_NA<-as.data.frame(matrix(rep(NA,(length_max-length_min)*3),nrow=(length_max-length_min))) #Create NA rows
 colnames(d_NA)<-c("x","x1","z")

d<-rbind(d,d_NA) #Your desired output
d
   x x1  z
1  1  1  2
2  2  2  3
3  3  3  4
4  4  4  5
5  5  5  6
6 NA NA NA
7 NA NA NA
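
As an alternative sketch, the rows can be kept aligned by padding first and then blanking out whole rows instead of shortening each vector separately. This rests on an assumption that only you can verify from your data: that x1 is missing just its last observation, so it can be padded with NA at the end. The names out_x, out_x1, n and bad are illustrative:

```r
x  <- c(-10, 1:6, 50)
x1 <- c(-20, 1:5, 60)
z  <- c(1, 2, 3, 4, 5, 6, 7, 8)

# Outliers by the boxplot rule, computed on the original vectors
out_x  <- grDevices::boxplot.stats(x)$out
out_x1 <- grDevices::boxplot.stats(x1)$out

# ASSUMPTION: the value missing from x1 is the last one, so pad at the end
n <- max(length(x), length(x1), length(z))
length(x1) <- n   # extending the length fills with NA

d <- data.frame(x = x, x1 = x1, z = z)

# Rows in which x or x1 holds an outlier
bad <- d$x %in% out_x | d$x1 %in% out_x1

# Replace those rows with NA so the remaining values stay aligned
d[bad, ] <- NA
d
```

Under that assumption, rows 1, 7 and 8 become NA and rows 2 to 6 keep their original x, x1 and z values side by side. If the missing x1 value sits somewhere else, the padding step has to change accordingly.
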
Terru_theTerror
  • How can I solve these two problems? – San.O May 29 '18 at 12:41
  • The first is related to the logic of your data: if you understand your data and your problem, you already have the answer. The second problem is easy: remove the code lines marked with the comment "#remove this" in my answer. – Terru_theTerror May 29 '18 at 12:43