22

I have the following data frame:

> str(df)
'data.frame':   3149 obs. of  9 variables:
 $ mkod : int  5029 5035 5036 5042 5048 5050 5065 5071 5072 5075 ...
 $ mad  : Factor w/ 65 levels "Akgün Kasetçilik         ",..: 58 29 59 40 56 11 33 34 19 20 ...
 $ yad  : Factor w/ 44 levels "BAKUGAN","BARBIE",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ donem: int  201101 201101 201101 201101 201101 201101 201101 201101 201101 201101 ...
 $ sayi : int  201101 201101 201101 201101 201101 201101 201101 201101 201101 201101 ...
 $ plan : int  2 2 3 2 2 2 7 3 2 7 ...
 $ sevk : int  2 2 3 2 2 2 6 3 2 7 ...
 $ iade : int  0 0 3 1 2 2 6 2 2 3 ...
 $ satis: int  2 2 0 1 0 0 0 1 0 4 ...

I want to remove 21 specific rows from this data frame.

> a <- df[df$plan==0 & df$sevk==0,]
> nrow(a)
[1] 21

So when I remove those 21 rows, I will have a new data frame with 3149 - 21 = 3128 rows. I found the following solution:

> b <- df[df$plan!=0 | df$sevk!=0,]
> nrow(b)
[1] 3128

My solution above uses a modified logical expression (!= instead of == and | instead of &). Other than modifying the original logical expression, how can I obtain the new data frame without those 21 rows? I need something like this:

> df[-a,] #does not work

EDIT (especially for the downvoters; I hope this explains why I need an alternative solution): I asked for a different approach because I'm writing a long script with various assignments (like a in my example) scattered through it. When I need to remove rows later in the script, I don't want to go back and write the inverse of each logical expression used in those assignments. That's why something like df[-a,] is more convenient for me.

Mehper C. Palavuzlar
  • -1 You have a solution contained within the question. There is no problem to solve (as the question is currently worded). – Richie Cotton Oct 27 '11 at 13:10
  • @RichieCotton: My solution uses a modified (different) logical expression which ends up with the result I need; but what I want to see is how to remove specific rows from a data frame. I included my solution in my question because I didn't want to see it in the answers. – Mehper C. Palavuzlar Oct 27 '11 at 13:16
  • I've added a few lines to my question to explain what I want to know. – Mehper C. Palavuzlar Oct 27 '11 at 13:22
  • I think there is confusion over why you want something like `df[-a,]`, when `df[df$plan!=0 | df$sevk!=0,]` seems to be the correct approach. Could you comment why, in the bigger picture, something like `df[-a,]` is preferable? Perhaps, in the bigger picture, there is an approach which avoids this problem. – jthetzel Oct 27 '11 at 21:50
  • It's because I'm writing a long script with various assignments (like `a` in my example) scattered through it. When I need to remove rows later in the script, I don't want to go back and write the inverse of each logical expression used in those assignments. That's why `df[-a,]` is more convenient for me. – Mehper C. Palavuzlar Oct 28 '11 at 06:53

5 Answers

15

Just negate your logical subscript:

a <- df[!(df$plan==0 & df$sevk==0),]
Joshua Ulrich
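
A minimal sketch of how negation can cover the "I don't want to rewrite the expression later" case from the question, assuming the condition is stored as a logical vector instead of as the subsetted rows. The toy data and the name `to_drop` are hypothetical, not from the question:

```r
# Hypothetical toy data standing in for the asker's df
df <- data.frame(plan = c(2, 0, 3, 0, 7),
                 sevk = c(2, 0, 3, 1, 6))

# Store the condition itself, not the rows it selects
to_drop <- df$plan == 0 & df$sevk == 0

a <- df[to_drop, ]    # rows matching the condition
b <- df[!to_drop, ]   # all other rows; the expression is not rewritten

nrow(a)
nrow(b)
```

If `plan` or `sevk` can contain `NA`, `to_drop` may contain `NA` as well, and both subsets would then pick up all-`NA` rows; `df[!(to_drop %in% TRUE), ]` is one way to treat `NA` as "keep".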
12

You can use the rownames to specify a "complementary" data frame. It's easier if they are numerical rownames:

df[-as.numeric(rownames(a)),]

But more generally you can use:

df[setdiff(rownames(df),rownames(a)),]
James
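
A small sketch of the `setdiff()` variant on a data frame with character row names, where the `as.numeric()` form cannot be used. The toy data and row names are made up for illustration:

```r
# Hypothetical toy data with non-numeric row names
df <- data.frame(plan = c(2, 0, 3, 0),
                 sevk = c(2, 0, 3, 0),
                 row.names = c("r1", "r2", "r3", "r4"))

a <- df[df$plan == 0 & df$sevk == 0, ]          # rows r2 and r4

# Keep every row of df whose row name does not appear in a
b <- df[setdiff(rownames(df), rownames(a)), ]

rownames(b)
```

The `as.numeric(rownames(a))` form relies on the row names still matching row positions, which is probably not the case if `df` was itself produced by an earlier subsetting step.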
9

Are you looking for subset()?

dat <- airquality
dat.sub <- subset(dat, Temp > 80 & Month < 10)

dim(dat)
dim(dat.sub)

Applied to your example:

df.sub <- subset(df, plan != 0 | sevk != 0)
jthetzel
2

You're almost there. 'a' needs to be a vector of indices:

    df <- data.frame(plan=runif(10),sevk=runif(10))
    a <- c(df$plan<.1 | df$sevk < .1) # some logical thing
    df[-a,]

or, with your data:

    a <- c(df$plan==0 & df$sevk==0)
    df[-a,]
tim riffe
  • I tried the last two lines of your code with my data, but it gives the wrong result (3148 rows instead of 3128). (BTW, `b[-a,]` should be `df[-a,]` I guess) – Mehper C. Palavuzlar Oct 27 '11 at 12:48
  • sorry about the slop- it works with my self-contained little example above, so I guess whatever is going on with your data is over my head – tim riffe Oct 27 '11 at 13:16
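
A sketch of what is likely behind the 3148-row result: applying unary minus to a logical vector coerces it to integers (0 and -1), so `df[-a, ]` drops at most the first row. Converting to positions with `which()` first, or negating with `!`, removes the intended rows. The toy data below is hypothetical:

```r
# Hypothetical toy data; rows 2 and 4 match the condition
df <- data.frame(plan = c(2, 0, 3, 0, 7),
                 sevk = c(2, 0, 3, 0, 6))

a <- df$plan == 0 & df$sevk == 0   # logical vector, one element per row

neg <- -a            # integers 0 and -1, not "the other rows"
nrow(df[neg, ])      # only row 1 is removed

idx <- which(a)      # integer positions of the matching rows
nrow(df[-idx, ])     # matching rows removed
nrow(df[!a, ])       # same result using logical negation
```

One caveat with the `which()` form: if no row matches, `idx` is empty and `df[-idx, ]` returns zero rows, whereas `df[!a, ]` returns the whole data frame.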
0

I don't see why you object to your solution, but here's another way.

killlist <- which(df$plan==0 & df$sevk==0)  # row numbers of the rows to drop
newdf <- df[-killlist, ]  # assumes killlist is non-empty
Carl Witthoft