General newbie when it comes to time series data analysis in R. I am having trouble translating a bit of Stata code into R code for a replication project I am doing.
The intent of the Stata code and the Stata code (from the original analysis) are the following:
#### Delete extra yearc observations with different wartypes #####
drop if yearc==yearc[_n+1] & wartype!="CIVIL"
drop if yearc==yearc[_n-1] & wartype!="CIVIL"
So, translated, I keep the rows in which the country is having a civil war and delete the rows in which there is an interstate war during the same years.
I have named the data object (i.e., the data set)
mywar
in R.
I am assuming I somehow do a conditional ifelse statement, or something similar, such as:
invisible(mywar$yearc <- ifelse(mywar$yearc==n-1 | mywar$yearc==n+1 | mywar$wartype!=civil, NA,
mywar$yearc)) # I am assuming I cannot condition ifelse statements like this; but, this is how I imagine it
mywar <- mywar[!is.na(mywar$yearc),]
EDIT: So perhaps an example
> b <- c(1970, 1970, 1970, 1971, 1982, 1999, 1999, 2000, 2001, 2002)
> c <- c("inter", "civil", "intra", "civil", "civil", "inter", "civil", "civil", "civil", "civil")
> df <- data.frame(b,c)
> df$j <- ifelse(df$b==n-1 & df$b==n+1 & df$c!="civil", NA, df$b)
> df
b c j
1 1970 inter 1970
2 1970 civil 1970
3 1970 intra 1970
4 1971 civil 1971
5 1982 civil 1982
6 1999 inter 1999
7 1999 civil 1999
8 2000 civil 2000
9 2001 civil 2001
10 2002 civil 2002
So, what I was trying to do was create NAs for rows 1,3,and 6 as they are duplicate years in my logistic regression on the onset of civil war (I am not interested in inter and intra wars, however defined) so that I can delete these rows from my data set. Here, I just recreated row b. (Note, what is missing from this made up data are the country ids. But assume that these ten entries represent the same country (for instance, Somalia)). So, I am interested in how to delete these type of rows in a data set with 28,000 rows.