I have the following variables covering 5 years: Name, DOB, DateDX, LabTest, LabResult.
I need to deduplicate so that matching lab tests and results are counted only once per patient within a 365-day period.
The last line of my code below deduplicates all of the cases, counting each unique patient/LabTest/LabResult combination only once. But I still need to add the >365 day time period based on DateDX.
I've looked into creating a loop or a function, but I find these difficult to think through. An example I found looks something like this, but I'm lost trying to adapt it.
# function for identifying dates.
cases<-function(x,lag=12){
return(diff(log(x),lag))
}
j = 1
for(i in MyData){
n = which(Last(pp1) == i)
returnsmatrix[,j] = rets(Last[,n],1)
j=j+1
}
Here is my code so far:
library(readr)
library(data.table)
library(plyr)
library(dplyr)
library(dtplyr)
library(tidyr)
library(htmltools)
data <- group_by(data, TestResult)
datasummary <- summarise(data,
Test1=length(which(TestType=="1")),
Test2=length(which(TestType=="2")),
Test3=length(which(TestType=="3")),
Test4=length(which(TestType=="4"))
)
dedupdata <- data[!duplicated(data[,c("Name", "DOB", "LabTest", "LabResult")]),]
Can I just add a criterion that looks at the date of each row and selects the next match where the difference in DateDX is >= 365 days?
I expect to get multiple matching TestType and TestResult records back for a patient, but they need to be at least one year apart based on DateDX.
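To make the rule I'm after concrete, here is a minimal sketch of what I think the logic would look like. It assumes DateDX and DOB are already of class Date, that a patient is identified by Name plus DOB, and that the 365 days are counted from the most recently kept record (not from the previous row). The helper name `keep_outside_window` and the `toy` data are made up for illustration.

```r
library(dplyr)

# Hypothetical helper: given dates sorted ascending within one
# patient/LabTest/LabResult group, flag the rows to keep. A row is kept
# when it is the first in the group or falls at least `window` days after
# the most recently kept row.
keep_outside_window <- function(dates, window = 365) {
  keep <- logical(length(dates))
  last_kept <- as.Date(NA)
  for (i in seq_along(dates)) {
    gap <- as.numeric(difftime(dates[i], last_kept, units = "days"))
    if (is.na(last_kept) || gap >= window) {
      keep[i] <- TRUE
      last_kept <- dates[i]
    }
  }
  keep
}

# Made-up example data: patient A has four matching results, patient B one.
toy <- data.frame(
  Name      = c("A", "A", "A", "A", "B"),
  DOB       = as.Date(c("1980-01-01", "1980-01-01", "1980-01-01",
                        "1980-01-01", "1975-06-15")),
  DateDX    = as.Date(c("2015-01-01", "2015-06-01", "2016-01-05",
                        "2016-03-01", "2015-02-01")),
  LabTest   = c("1", "1", "1", "1", "1"),
  LabResult = c("Pos", "Pos", "Pos", "Pos", "Pos"),
  stringsAsFactors = FALSE
)

dedup_toy <- toy %>%
  arrange(Name, DOB, LabTest, LabResult, DateDX) %>%  # sort dates within each group
  group_by(Name, DOB, LabTest, LabResult) %>%
  filter(keep_outside_window(DateDX)) %>%             # apply the window per group
  ungroup()
```

On the toy data this should keep patient A's 2015-01-01 and 2016-01-05 records plus patient B's single record. If the window should instead be measured from the previous row rather than the last kept row, the helper would need to change.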