This is my first post in stackoverflow. I've tried to ask the question according to the guidelines.
I am currently working on a marginal structural model. I've read most of the articles by Hernan on this topic and worked through his book What if and the provided R code. However, all the examples I find are rather straight forward. For example, patients either receive or don't receive treatment A at start of follow-up and remain on that treatment during the whole time. I find it difficult to apply what I've learned to my dataset.
My dataset is about time-to-event data. During a patient's follow-up time, they can receive treatment A. Some patients already receive treatment A at start of follow-up, whereas other start later during their follow-up or not at all. Additionally patients who are on treatment A can discontinue the treatment and restart the treatment again during their follow-up, i.e. a time-varying treatment.
Here I provide an example dataset with code of a crude MSM of time-to-event data. Instead of a Cox PH regression, I use a pooled logistic regression of which the odds ratio is similar to the hazard ratio of the Cox model. To keep it simple, I did not calculate censoring weights.
set.seed(1948)
idvar<-1:100 #100 subjects
v1<-sample(c(0,1), replace=TRUE, size=length(idvar)) #One baseline variable
time<-sample(1:100, replace=TRUE,size=length(idvar)) #Follow-up time
event<-sample(c(0,1), replace=TRUE, size=length(idvar)) #End of follow-up event. If 0 patients either reached administrative censoring or a competing event
df<-data.frame(idvar,v1,time,event) #Create dataframe
library(splitstackshape)
df.expand<-expandRows(df,"time", drop = F) #Create dataframe with 1 row per 1 unit of time per subject
df.expand$time_1 <- sequence(rle(df.expand$idvar)$lengths)-1 #Calculates times in expanded dataframe
df.expand$event_true <- as.factor(ifelse(df.expand$time_1==df.expand$time-1 & #Only event at last follow-up
df.expand$event==1, 1, 0))
df.expand$treatment<-sample(c(0,1), replace=TRUE,size=nrow(df.expand)) #Time-varying treatment
df.expand$v2<-sample(c(0,1), replace=TRUE,size=nrow(df.expand)) #Time-varying covariate
#Calculate weights. Need help here
pred_dnom = predict(glm(treatment~factor(v1) + factor(v2), #Time-varying covariates only in denominator
data = df.expand,
family = binomial(link = "logit")), type = "response")
pred_num = predict(glm(treatment~factor(v1),
data = df.expand,
family = binomial(link = "logit")), type = "response")
df.expand$sw <- ifelse(df.expand$treatment == 0, ((1 - pred_num) / (1 - pred_dnom)),
(pred_num / pred_dnom))
##Normally I would also calculate censoring weights and multiply those with the treatment weights##
fit1<-glm(event_true==1~treatment+factor(time_1),
data = df.expand, weights = sw,
family = binomial(link = "logit"))
However, in the example above the weights to not account for previous treatment history and time-varying confounder treatment history. So my question is, how to incoporate these histories into time-varying treatment. This article by Fewell et al. describes the method to do so for time-varying treatment but where the patient cannot quit and restart treatment. I quote:
To estimate each subject’s probability of their complete treatment history up to each month (the denominator in 3), we multiply the estimated probabilities of their observed treatment during each month cumulatively over time. The first estimated probability for each subject is left as it is. For all others, the estimated probability at the current time point is multiplied by the estimated probability at the previous time point. [Fewell Z, Hernán MA, Wolfe F, Tilling K, Choi H, Sterne JAC. Controlling for Time-dependent Confounding using Marginal Structural Models. The Stata Journal. 2004;4(4):402-420. doi:10.1177/1536867X0400400403]
As can be seen in the code of their article, once patients start treatment, their weights are set to 1. Another method I know a lot of people use for calculating weights is the ipw package. However, this package also sets weights to 1 when a patient starts treatment, after which the weights also remain 1.
One of the authors of the ipw package also contributed to the article by Grafféo et al. which extended the ipw package to allow time-varying weights. They do so by calculating weights for initiating therapy when not on therapy and weights for discontinuing therapy while on therapy. (Code available as supporting information in the article) However, they did not consider the histories of treatments/time-varying confounders.
More precisely, we did not consider the whole history of confounding treatments exposure but only considered that the exposure of interest at time t only depended on the confounding treatments given at time t. Then, instead of computing the weights up to the first time when the exposure of interest is modified, the weights are computed for all time points, that is, for all changes in the treatment of interest status and for all event times. [Grafféo, N, Latouche, A, Geskus, RB, Chevret, S. Modeling time-varying exposure using inverse probability of treatment weights. Biometrical Journal. 2018; 60: 323– 332. https://doi.org/10.1002/bimj.201600223]
for which they suggest the following
We did not take into account confounders' history. This could be addressed using a different model for each switch in the relationship between the exposure of interest and the confounders. Further work is needed to study this modeling options. The simplest way would be to include a counter of the previous exposure as a covariate in the treatment model. [Grafféo, N, Latouche, A, Geskus, RB, Chevret, S. Modeling time-varying exposure using inverse probability of treatment weights. Biometrical Journal. 2018; 60: 323– 332. https://doi.org/10.1002/bimj.201600223]
So the method by Fewell does incorporate treatment and time-varying variable histories but does not allow weights to change after starting therapy. The method by Grafféo does allow for weights to vary over time but does not incorporate treatment and time-varying variable histories. So what I actually want is to combine these methods, but I absolutely don't know how that would be possible.
I hope my question is clear (and asked correctly) and somebody has a suggestion. Thanks!