I am processing records of varying lengths from a large dataset using data.table[, somefunc(someseries), by=]. The length L of each record someseries could be anything from 1 to 50. I want to handle the following efficiently, without needlessly adding an if expression:
For each group, I want the simplest way to access its middle entries, someseries[3:(L-2)].
Problem: beware that when L<5, the expression someseries[3:(L-2)] actually misbehaves by counting backwards. This is due to the default "helpful" behavior of from:to, which, like seq(from, to, by = ((to - from)/(length.out - 1)), ...), infers the direction from its endpoints, i.e. it effectively uses by=-1 when from > to.
In that case I just want somefunc to be passed an empty vector(), not someseries[4:2]. But you can't explicitly do seq(..., by=1), because that errors if from > to.
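To make the two failure modes concrete, here is a minimal sketch for a record of length 4. The variable names backwards and res are only for illustration, and the error is caught with tryCatch() so the snippet runs to the end.

```r
# A record of length L = 4, so the intended middle slice 3:(L-2) should be empty
L <- 4
someseries <- 1:L

# `:` silently counts down when from > to, so we get entries 3 and 2
backwards <- someseries[3:(L - 2)]
print(backwards)

# Forcing by = 1 does not help: seq() raises an error when from > to.
# The error is captured here as a string instead of stopping the script.
res <- tryCatch(seq(3, L - 2, by = 1),
                error = function(e) conditionMessage(e))
print(res)
```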
Here's a testcase:
set.seed(15)
ragged_arrays <- lapply(ceiling(runif(5,1,5)), function(n) (1:n) )
# indexing with unwanted auto-backwards
lapply(ragged_arrays, function(someseries) someseries[3 : (length(someseries)-2)] )
For the sake of our testcase, somefunc is a function which behaves gracefully when passed an empty vector, e.g. median().
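For reference, the behaviour being asked for can be sketched as follows. The helper name middle() is hypothetical and not part of data.table or base R. It is only one possible way to build the index without from:to, and not necessarily the simplest; it assumes "middle entries" means dropping the first two and last two elements.

```r
# Hypothetical helper: keep entries 3..(L-2), never counting backwards
middle <- function(x) {
  n <- max(0, length(x) - 4)   # number of middle entries, floored at 0
  x[seq_len(n) + 2]            # seq_len(0) is empty, so short records yield an empty vector
}

# Same ragged test data as in the testcase
set.seed(15)
ragged_arrays <- lapply(ceiling(runif(5, 1, 5)), function(n) 1:n)

# median() of an empty vector returns NA rather than raising an error
lapply(ragged_arrays, function(someseries) median(middle(someseries)))
```

Because seq_len() only ever counts upwards from 1, an empty middle section gives a zero-length index instead of a reversed one.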