Suppose I have dummy data like this:

x1  x2  x3  x4  x5  x6
26  25  30  0   23  27
24  0   26  0   22  30
20  0   24  0   21  21
27  0   26  0   27  25
22  0   0   0   28  22
20  0   0   0   24  20
22  0   0   0   20  27
22  0   0   0   23  28
30  0   0   0   27  24
23  0   0   0   24  22
26  0   0   0   26  26

I need to clean this data:
1. Delete all columns that contain only zero values (for example, x4).
2. Delete all columns with fewer than 5 non-zero values (x2 and x3).

Is it possible to write a function or loop that does this?

G5W
psysky
Try `Filter(function(x) !(any(x==0)|any(x <5)), df1)`, or possibly `Filter(function(x) !(all(x==0)|any(x <5)), df1)`, `Filter(function(x) !(all(x==0)|sum(x!=0)<5), df1)`, or `Filter(function(x) sum(x!=0)>=5, df1)` – akrun Nov 06 '17 at 12:28
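As a rough illustration of the last variant in the comment above, here is a minimal sketch that rebuilds the question's dummy data and applies `Filter` to it. The name `df1` comes from the comment; `kept` is just an illustrative variable name.

```r
# Rebuild the dummy data from the question (11 rows, 6 columns)
df1 <- data.frame(
  x1 = c(26, 24, 20, 27, 22, 20, 22, 22, 30, 23, 26),
  x2 = c(25, rep(0, 10)),
  x3 = c(30, 26, 24, 26, rep(0, 7)),
  x4 = rep(0, 11),
  x5 = c(23, 22, 21, 27, 28, 24, 20, 23, 27, 24, 26),
  x6 = c(27, 30, 21, 25, 22, 20, 27, 28, 24, 22, 26)
)

# Filter() treats the data frame as a list of columns and keeps
# those for which the predicate returns TRUE
kept <- Filter(function(x) sum(x != 0) >= 5, df1)

names(kept)
```

An all-zero column has zero non-zero values, so the single condition `sum(x != 0) >= 5` covers both cleaning rules from the question.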

1 Answer

Try this:

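# sample data
x1 <- c(0,2,5,7,2,3,0,3)
x2 <- c(2,3,0,0,1,0,4,0)
x3 <- c(0,0,0,0,0,0,0,0)
x4 <- c(2,5,1,2,3,4,5,6)

df <- data.frame(x1,x2,x3,x4)

# keep only columns with at least 5 non-zero values
# (all-zero columns have 0 non-zero values, so they are dropped too);
# drop = FALSE keeps a data frame even if only one column remains
df <- df[, !(colSums(df != 0) < 5), drop = FALSE]

# same result, just with the logic inverted
df <- df[, colSums(df != 0) >= 5, drop = FALSE]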

So this data frame

> df
  x1 x2 x3 x4
1  0  2  0  2
2  2  3  0  5
3  5  0  0  1
4  7  0  0  2
5  2  1  0  3
6  3  0  0  4
7  0  4  0  5
8  3  0  0  6

becomes this:

> df
  x1 x4
1  0  2
2  2  5
3  5  1
4  7  2
5  2  3
6  3  4
7  0  5
8  3  6
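Since the question asks for a function, the same idea could be wrapped up roughly as follows. This is only a sketch: `drop_sparse_columns` and its `min_nonzero` parameter are hypothetical names, not part of the original answer, and treating `NA` as "not a non-zero value" via `na.rm = TRUE` is an assumption.

```r
# Hypothetical helper: keep columns with at least `min_nonzero` non-zero values
drop_sparse_columns <- function(df, min_nonzero = 5) {
  # df != 0 gives a logical matrix; colSums counts the TRUEs per column
  # (assumption: NA entries are simply not counted as non-zero)
  keep <- colSums(df != 0, na.rm = TRUE) >= min_nonzero
  # drop = FALSE keeps a data frame even when a single column survives
  df[, keep, drop = FALSE]
}

# Same sample data as in the answer above
sample_df <- data.frame(
  x1 = c(0, 2, 5, 7, 2, 3, 0, 3),
  x2 = c(2, 3, 0, 0, 1, 0, 4, 0),
  x3 = c(0, 0, 0, 0, 0, 0, 0, 0),
  x4 = c(2, 5, 1, 2, 3, 4, 5, 6)
)

cleaned <- drop_sparse_columns(sample_df)
names(cleaned)
```

Making the threshold a parameter means the same helper also covers the "all zeros" rule on its own, for example with `min_nonzero = 1`.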
f.lechleitner