dcast() is used for reshaping data from long to wide format (thereby aggregating), whereas the OP wants to reshape from wide to long format, thereby filling in the missing timestamps.
There is an alternative approach which uses a non-equi join.
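For contrast, here is a minimal sketch of what dcast() does, assuming a small made-up table (the names long, id, variable and value are hypothetical and not part of the question's data):

```r
library(data.table)

# Hypothetical long-format data: one row per (id, variable) pair
long <- data.table(
  id       = c(1L, 1L, 2L, 2L),
  variable = c("a", "b", "a", "b"),
  value    = c(10, 20, 30, 40)
)

# Long to wide: one row per id, one column per distinct value of 'variable'
wide <- dcast(long, id ~ variable, value.var = "value")
wide
```

The result has fewer rows than the input, which is the opposite of what is needed here.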
Prepare data
However, before we can proceed, startTime and endTime need to be turned into numeric variables by removing the trailing "s".
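library(data.table)
# pick the startTime and endTime columns
cols <- stringr::str_subset(names(DF), "Time$")
# strip the trailing "s" and convert to numeric, updating DF by reference
setDT(DF)[, (cols) := lapply(.SD, function(x) as.numeric(stringr::str_replace(x, "s$", ""))),
          .SDcols = cols]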
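As a quick check of the conversion step, the same idea can be tried on a small vector with base R's sub(). This is only a sketch; the vector x is made up for illustration.

```r
# Hypothetical input in the same format as startTime / endTime
x <- c("1.900s", "2.300s", "3s")

# Remove only a trailing "s" (anchored with $), then convert to numeric
x_num <- as.numeric(sub("s$", "", x))

x_num
```

Anchoring the pattern with $ ensures that only an "s" at the end of the string is removed.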
Non-equi join
A sequence of timestamps covering the whole period is created and right joined to the dataset, but only those timestamps which fall within one of the given intervals are retained. From the accepted answer, it seems that endTime must not be included in the result, so the join condition has to be adjusted accordingly.
DF[DF[, CJ(time = seq(min(startTime), max(endTime), 0.1))],
   on = .(startTime <= time, endTime > time), nomatch = 0L][
     , endTime := NULL][]  # a bit of clean-up
    startTime  word
 1:       1.9   hey
 2:       2.0   hey
 3:       2.1   hey
 4:       2.2   hey
 5:       2.3   I'm
 6:       2.4   I'm
 7:       2.5   I'm
 8:       2.6   I'm
 9:       2.7   I'm
10:       2.8  John
11:       2.9  John
12:       3.0 right
13:       3.1 right
14:       3.2 right
15:       3.3 right
16:       3.4   now
17:       3.5     I
18:       3.6     I
19:       3.7     I
20:       3.8  help
21:       3.9  help
22:       4.0  help
23:       4.1  help
24:       4.2  help
    startTime  word
Note that this approach does not require introducing row numbers. nomatch = 0L avoids NA rows in case of gaps in the dialogue.
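To illustrate the effect of nomatch = 0L, here is a minimal sketch with a made-up dialogue that contains a gap. The names gap and grid are hypothetical, and whole-number timestamps are used so that the comparisons are exact.

```r
library(data.table)

# Hypothetical dialogue: nothing is said between time 3 and time 5
gap <- data.table(
  startTime = c(1L, 5L),
  endTime   = c(3L, 7L),
  word      = c("a", "b")
)

# All timestamps covering the whole period
grid <- gap[, .(time = seq(min(startTime), max(endTime), by = 1L))]

# Default behaviour: timestamps without a matching interval yield NA rows
with_na <- gap[grid, on = .(startTime <= time, endTime > time)]

# nomatch = 0L drops the timestamps which fall into no interval
no_na <- gap[grid, on = .(startTime <= time, endTime > time), nomatch = 0L]

with_na
no_na
```

In this sketch, with_na keeps every timestamp of the grid and has word set to NA where no interval matches, while no_na contains only the timestamps covered by an interval.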
Data
library(data.table)
DF <- fread("
rn startTime endTime word
1 1.900s 2.300s hey
2 2.300s 2.800s I'm
3 2.800s 3s John
4 3s 3.400s right
5 3.400s 3.500s now
6 3.500s 3.800s I
7 3.800s 4.300s help
", drop = 1L)