5

I need to set certain numeric values in one column of my data frame to zero, if in another column they have a certain factor level.

My dataframe df looks something like:

Items Store.Type
5      A
4      B
3      C
6      D
3      B
7      E

What I want to do is make Items = 0, for all rows where Store.Type = "A" or "C"

I'm very new to R, but figured this would be a conditional statement of the form "If Store.Type A then Items <- 0" (and then repeat for Store.Type C), but I didn't understand the ?"if" page at all. I tried:

df$ItemsFIXED <- with(df, if(Store.Type == "A")Items <-0)

and got the warning message:

Warning message:
In if (Store.Type2 == "Chain - Brand") Total.generic.items <- 0 :
 the condition has length > 1 and only the first element will be used

I then came across the following explanation:

  • if is a control flow statement, taking a single logical value as an argument
  • ifelse is a vectorised function, taking vectors as all its arguments.
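
A minimal sketch of that distinction, using a made-up vector x (not from the data above). It only illustrates the two behaviours and is not a fix for the problem itself:

```r
x <- c(5, 4, 3)

# `if` wants exactly one TRUE/FALSE, so test a single element
single <- if (x[1] > 4) "big" else "small"

# `ifelse` tests every element and returns a vector of the same length
all_elems <- ifelse(x > 4, "big", "small")
```

Here single is one string, while all_elems has one entry per element of x.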

So figuring I need ifelse to do the whole column and being able to understand the ?ifelse page, I tried to do "If Store.Type A then Items <- 0 else do nothing". In fact I wanted it nested, so I tried the following code (creating a new column for now so I don't mess up my data, but eventually it will overwrite the Items data)

df$ItemsFIXED <- with(df, ifelse(Store.Type == "A", Items <-0, 
                          ifelse(Store.Type == "C", Items <-0,)))

and got the following error:

Error in ifelse(Store.Type2 == "Franchise - Brand", Total.generic.items <- 0,  : 
  argument "no" is missing, with no default

But whatever I put in for the no argument simply overwrites the values that are already correct. I tried putting Items and Items <- Items there to say "else leave Items as Items", as in the following, but this just changed everything to zero.

df$ItemsFIXED <- with(df, ifelse(Store.Type == "A", Items <-0, 
                          ifelse(Store.Type == "C", Items <-0,Items)))

Is there a way to tell ifelse to do nothing, or is there an easier way to do this?

JenLouise
  • `df$Items[which(df$Store.Type == "A" | df$Store.Type == "C" )] <- 0` – Alex Sep 18 '14 at 03:00
  • i.e. find the rows that need to be changed, then set those entries to 0. – Alex Sep 18 '14 at 03:02
  • Thanks everyone for those solutions, they do indeed all work, but I'm not exactly sure how/why yet! I think the `which` and `%in%` seem to be the simplest ones that I should look into more. – JenLouise Sep 18 '14 at 06:52
  • Oh and @Alex I would vote for your answer but I don't know how, given that it is in a comment... – JenLouise Sep 18 '14 at 06:54

5 Answers

5

Or you could use %in% for multiple match/replacement

df$Items[df$Store.Type %in% c("A", "C")] <- 0
df
#   Items Store.Type
# 1     0          A
# 2     4          B
# 3     0          C
# 4     6          D
# 5     3          B
# 6     7          E
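
As a hedged illustration of why %in% suits a multi-value match, the sketch below rebuilds the example data by hand (assuming Store.Type is a factor, as the question describes) and compares %in% with == against a two-element vector. The names in_match and eq_match are only for illustration.

```r
df <- data.frame(Items = c(5, 4, 3, 6, 3, 7),
                 Store.Type = factor(c("A", "B", "C", "D", "B", "E")))

# %in% asks, for each row, "is this value one of A or C?"
in_match <- df$Store.Type %in% c("A", "C")

# == recycles c("A", "C") as A, C, A, C, A, C and compares position by position,
# so row 3 ("C" compared with "A") is not matched
eq_match <- df$Store.Type == c("A", "C")
```

With this data in_match is TRUE for rows 1 and 3, whereas eq_match is TRUE only for row 1, which is why == is not a safe substitute when matching several values.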
akrun
2

Using within also seems to be an option:

within(df, Items[Store.Type %in% c("A","C")] <- 0)

  Items Store.Type
1     0          A
2     4          B
3     0          C
4     6          D
5     3          B
6     7          E
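
One thing worth noting, sketched below with the example data rebuilt by hand: within() returns a modified copy rather than changing the data frame in place, so the result has to be assigned if it should persist. The name fixed is just an illustrative choice.

```r
df <- data.frame(Items = c(5, 4, 3, 6, 3, 7),
                 Store.Type = c("A", "B", "C", "D", "B", "E"))

# within() evaluates the assignment in a copy of df and returns that copy
fixed <- within(df, Items[Store.Type %in% c("A", "C")] <- 0)

df$Items     # the original column is untouched
fixed$Items  # zeros where Store.Type is A or C
```

Writing df <- within(df, ...) instead would overwrite the original once you are happy with the result.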
ddiez
1

You can use vectorized replacement here. If df is your data set,

> df$Items[with(df, Store.Type == "A" | Store.Type == "C")] <- 0L
> df
#   Items Store.Type
# 1     0          A
# 2     4          B
# 3     0          C
# 4     6          D
# 5     3          B
# 6     7          E

with(df, Store.Type == "A" | Store.Type == "C") returns a logical vector with one TRUE or FALSE per row. When a logical vector is used inside [...], only the elements at the TRUE positions are selected. So if we subset Items with that vector, we can replace just those elements using [<-.
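
To see each step separately, here is a small sketch on the same example data (rebuilt by hand; the intermediate names idx and selected are only for illustration):

```r
df <- data.frame(Items = c(5L, 4L, 3L, 6L, 3L, 7L),
                 Store.Type = c("A", "B", "C", "D", "B", "E"))

# Step 1: one TRUE/FALSE per row
idx <- with(df, Store.Type == "A" | Store.Type == "C")

# Step 2: subsetting with it picks the Items at the TRUE positions
selected <- df$Items[idx]

# Step 3: assigning through the same subset overwrites only those positions
df$Items[idx] <- 0L
```

Rows where idx is FALSE are never touched, so there is no need for an explicit "else do nothing" branch.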

Also, if you wanted to use ifelse, you could do things like

df$Items <- with(df, ifelse(Store.Type == "A" | Store.Type == "C", 0L, Items))

or

within(df, Items <- ifelse(Store.Type == "A" | Store.Type == "C", 0L, Items))

but take note that ifelse can be slow at times, even more so when coupled with within, and will likely be slower than the direct subset replacement shown first.
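
For comparison with the attempt in the question, a hedged sketch: the yes and no arguments of ifelse should be plain values, not assignments. As far as I can tell, writing Items <- 0 there overwrites Items inside the temporary environment that with() sets up, so the fallback Items is already zero by the time it is used, which would explain everything turning to zero. The example data is rebuilt by hand, and the names nested, single and broken are illustrative.

```r
df <- data.frame(Items = c(5, 4, 3, 6, 3, 7),
                 Store.Type = c("A", "B", "C", "D", "B", "E"))

# Nested form from the question, with values instead of assignments
nested <- with(df, ifelse(Store.Type == "A", 0,
                   ifelse(Store.Type == "C", 0, Items)))

# Same result with a single test
single <- with(df, ifelse(Store.Type %in% c("A", "C"), 0, Items))

# The question's version: the assignment clobbers Items before the fallback reads it
broken <- with(df, ifelse(Store.Type == "A", Items <- 0,
                   ifelse(Store.Type == "C", Items <- 0, Items)))
```

The assignment only affects the temporary copy that with() works on, so df itself is unchanged after running broken.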

Rich Scriven
1

The following also works:

> ddf[ddf$Store.Type=='A'| ddf$Store.Type=='C',]$Items = 0
> ddf
  Items Store.Type
1     0          A
2     4          B
3     0          C
4     6          D
5     3          B
6     7          E
rnso
0

Another way to solve this is with which and %in%:

df$Items[which(df$Store.Type %in% c('A','C'))] <- 0

  Items Store.Type
1     0          A
2     4          B
3     0          C
4     6          D
5     3          B
6     7          E
Greenonline