
I have a variable (FTA) with two values (yes or no), and I want to replace it with a dummy variable coded yes = 1 and no = 0. From time period t = 3 onwards it should equal 1, and before that it should be 0.

df <- dummy.data.frame(df, names = c("FTA"), sep = "_")

After inputting this line of code, I can't see any difference from before when I view the summary of the data (it still counts the number of no's and yes's in the column below the variable name).

I also tried doing:

dummy <- as.numeric(t >= 3)

dummy2 <- as.numeric(t < 3)

As well as:

ifelse(t >=3, 1, 0)

But I still can't observe any changes in the summary. Have I done this correctly, and what can I do to view the dummy variable I created and to replace the old one with it?

Edit: Example of data

My goal is to create a dummy variable that replaces "FTA".

Rahul Agarwal

2 Answers


Is this what you want? (This uses t >= 4 as the cutoff; the question asks for t >= 3, so adjust the threshold as needed.)

# Data:
t <- 1:10
FTA <- sample(c("yes", "no"), 10, replace = TRUE)  # random, so your values will differ
df <- data.frame(t, FTA)
df
    t FTA
1   1 yes
2   2 yes
3   3 yes
4   4  no
5   5  no
6   6  no
7   7 yes
8   8  no
9   9 yes
10 10 yes

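# Change `FTA` based on two conditions.
# Note: `new` ends up as a character column, because rows with t < 4
# keep their original "yes"/"no" text.
df$new <- ifelse(df$t >= 4 & df$FTA == "yes", 1,
                 ifelse(df$t >= 4 & df$FTA == "no", 0, as.character(df$FTA)))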
df
    t FTA new
1   1 yes yes
2   2 yes yes
3   3 yes yes
4   4  no   0
5   5  no   0
6   6  no   0
7   7 yes   1
8   8  no   0
9   9 yes   1
10 10 yes   1
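
If the goal is the one stated in the question (a purely numeric dummy that is 1 from t = 3 onwards and 0 before), a minimal sketch might look like the following. The toy data frame and the temporary name `FTA_dummy` are assumptions for illustration only. The key point is that the result has to be assigned back into `df`; a bare `ifelse(...)` or a free-standing `dummy` vector leaves `summary(df)` unchanged.

```r
# Toy data (hypothetical; stands in for the real data set)
df <- data.frame(t = 1:6,
                 FTA = c("no", "no", "yes", "yes", "yes", "yes"))

# Build the dummy from t and store it as a column of the data frame
df$FTA_dummy <- as.numeric(df$t >= 3)

# Replace the old yes/no column with the dummy, then drop the helper column
df$FTA <- df$FTA_dummy
df$FTA_dummy <- NULL

str(df)  # FTA should now be reported as a numeric column
```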
Chris Ruehlemann
  • Thank you! If I run a linear regression, how can I insert this dummy variable instead of the "FTA" variable that has "yes" and "no"? – user10831611 Dec 25 '18 at 11:25
  • I don't know your regression or your variables. But if one of them is the one I called `new`, you will first have to subset your data frame to *exclude* the rows that still contain "yes" and "no". This can be done as follows (again with "4" for consistency; feel free to change it): `df_subs <- df[df$t >= 4, ]` – Chris Ruehlemann Dec 25 '18 at 11:35
  • If the above is correct and your regression is linear, you'd simply do: `lm(whateverVariable ~ new, data = df_subs)`, or the other way round, depending on which is the response and which is the explanatory variable. – Chris Ruehlemann Dec 25 '18 at 11:39
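
As a rough sketch of the regression step, assuming a hypothetical response column `y` and a numeric 0/1 dummy called `FTA_dummy` (neither name comes from the question): when the dummy is numeric for every row, no subsetting should be needed, and the `data` argument of `lm()` avoids repeating the data frame name.

```r
# Hypothetical data: y is a made-up response, FTA_dummy is 1 from t = 3 onwards
df <- data.frame(t = 1:8,
                 y = c(2.1, 1.9, 3.2, 3.0, 3.4, 2.9, 3.3, 3.1))
df$FTA_dummy <- as.numeric(df$t >= 3)

# Regress the response on the dummy
fit <- lm(y ~ FTA_dummy, data = df)

# With a single 0/1 regressor, the intercept is the mean of y where the
# dummy is 0, and the slope is the difference in group means
coef(fit)
```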

You can do the following:

# sample data frame
df <- data.frame(t = c(1,2,3,4,5,6), flag = c('no','yes','yes','yes','yes','yes'))

# encode the values
df$flag <- ifelse(df$flag == 'yes', 1, 0)

# set values to 0 before time = 3
df$flag[df$t < 3] <- 0

df
  t flag
1 1    0
2 2    0
3 3    1
4 4    1
5 5    1
6 6    1
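
To check that the recoding took effect, here is a quick sketch using base R on the same sample data. `str()` reports the column type and `table()` counts the values; for a numeric column, `summary()` shows quartiles and a mean instead of counts of "yes" and "no". The variable name `counts` is just an illustrative choice.

```r
df <- data.frame(t = c(1, 2, 3, 4, 5, 6),
                 flag = c('no', 'yes', 'yes', 'yes', 'yes', 'yes'))
df$flag <- ifelse(df$flag == 'yes', 1, 0)
df$flag[df$t < 3] <- 0

str(df$flag)              # type of the recoded column
counts <- table(df$flag)  # number of 0s and 1s
counts
summary(df$flag)          # numeric summary rather than yes/no counts
```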
YOLO