Consider the data created here:
set.seed(123)
data <- data.frame(time = c(1:1000, 1:1030, 1:2000), ID = rep(c('a', 'b', 'c'), c(1000, 1030, 2000)), variable = rnorm(4030))
We have measured some process variable over the life of 3 individuals, each of which is denoted by a unique ID. Assume that time is a function of each individual's size, so each ID's series has a different length. We have noticed some pattern or property that is consistent across IDs, meaning it occurs in every time series (i.e., at some point in each ID's life), but at a different time in each one.
Let's say this happens at time == 500 for individual a, at time == 560 for individual b, and at time == 602 for individual c.
We can create a typical line plot to show each of the time series like this:
library(tidyverse)
data %>%
  ggplot(aes(time, variable, color = ID)) +
  geom_point() +
  geom_line()
But how can we center each time series on these "common data points" to show how similar this "pattern" is in each ID? In other words, in the plot, ID == c will start first because its common data point falls latest in its series, and data point 602 for ID == c will be displayed at the same x position as data point 500 for ID == a and data point 560 for ID == b.
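One possible way to get this alignment, sketched here under some assumptions, is to subtract each ID's alignment time from its time column, so that the common data point sits at 0 for every series. The lookup table events and the column names event_time and time_centered are hypothetical names introduced only for this illustration; the three alignment times are the ones stated above.

```r
library(dplyr)
library(ggplot2)

# Same data as in the question
set.seed(123)
data <- data.frame(
  time = c(1:1000, 1:1030, 1:2000),
  ID = rep(c("a", "b", "c"), c(1000, 1030, 2000)),
  variable = rnorm(4030)
)

# Hypothetical lookup table: the time of the shared pattern for each ID
events <- data.frame(
  ID = c("a", "b", "c"),
  event_time = c(500, 560, 602)
)

# Shift each series so that its alignment point lands at time_centered == 0
centered <- data %>%
  left_join(events, by = "ID") %>%
  mutate(time_centered = time - event_time)

# Plot against the shifted axis; the dashed line marks the common data point
p <- ggplot(centered, aes(time_centered, variable, color = ID)) +
  geom_line() +
  geom_vline(xintercept = 0, linetype = "dashed")
p
```

With this shift, ID == c begins furthest to the left (at -601), followed by b (at -559) and a (at -499), and all three alignment points coincide at 0 on the x axis.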