I have an integer64-indexed data.table object:
library(data.table)
library(bit64)
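# Pass the values as strings: numeric literals this large are parsed as doubles
# and lose their last digits before as.integer64() ever sees them.
some_data <- as.integer64(c("1514772184120000026", "1514772184120000068", "1514772184120000042", "1514772184120000078", "1514772184120000011", "1514772184120000043", "1514772184120000094", "1514772184120000085",
"1514772184120000083", "1514772184120000017", "1514772184120000013", "1514772184120000060", "1514772184120000032", "1514772184120000059", "1514772184120000029"))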
#
n <- 10
x <- setDT(data.frame(a = runif(n)))
x[, new_col := some_data[1:n]]
setorder(x, new_col)
Then I have a bunch of other integer64 values that I need to binary-search for in the index of my original data.table object (x):
search_values <- some_data[(n+1):length(some_data)]
If these were native integers I could use findInterval() to solve the problem:
values_index <- findInterval(search_values, x$new_col)
but when the arguments to findInterval are integer64, I get:
Warning messages:
1: In as.double.integer64(vec) :
integer precision lost while converting to double
2: In as.double.integer64(x) :
integer precision lost while converting to double
and wrong indexes:
> values_index
[1] 10 10 10 10 10
That is, the result says every entry of search_values is larger than all entries of x$new_col, which is not true.
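As far as I can tell, the wrong indexes follow directly from the warnings above: findInterval() converts its arguments to double, and a double cannot tell these values apart. A minimal base-R sketch, using two of the values from my data written as plain numeric literals (the variable names a and b are just for illustration):

```r
# Two of the identifiers from the question, written as ordinary numeric (double) literals.
a <- 1514772184120000026
b <- 1514772184120000068

# A double has a 53-bit mantissa, so integers above 2^53 are no longer all representable.
print(a > 2^53)

# Both literals round to the same double, so any comparison made on doubles sees them as equal.
collapsed <- (a == b)
print(collapsed)

# Show the value that is actually stored after rounding.
print(sprintf("%.0f", a))
```

Once everything collapses to one value, every search value compares as greater than or equal to every key, which would explain the run of 10s.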
Edit:
Desired output:
print(values_index)
9 10 6 10 1
Why:
values_index has as many entries as search_values. For each entry of search_values, the corresponding entry of values_index gives the rank that entry would have if it were inserted into x$new_col. So the first entry of values_index is 9 because the first entry of search_values (1514772184120000045) would have rank 9 among the entries of x$new_col.
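To make the rank definition above concrete, here is a rough sketch of the behaviour I am after, written as a hand-rolled binary search that only uses integer64 comparisons and never converts to double. The helper name find_interval64 and the small key and query vectors are hypothetical, made up for illustration; they are not part of bit64 or data.table, and I would prefer an existing, faster function if one exists.

```r
library(bit64)

# Hypothetical helper: for each query, count how many entries of sorted_keys
# are <= the query (the same convention as findInterval()).
# sorted_keys must already be sorted in increasing order.
find_interval64 <- function(queries, sorted_keys) {
  vapply(seq_along(queries), function(i) {
    q  <- queries[i]            # `[` keeps the integer64 class
    lo <- 0L                    # everything at positions 1..lo is <= q
    hi <- length(sorted_keys)   # everything at positions above hi is > q
    while (lo < hi) {
      mid <- (lo + hi + 1L) %/% 2L
      if (sorted_keys[mid] <= q) lo <- mid else hi <- mid - 1L
    }
    lo
  }, integer(1))
}

# Made-up values in the same range as the question's data, built from strings
# so that no digits are lost.
keys    <- as.integer64(paste0("1514772184120000", c("010", "020", "030", "040")))
queries <- as.integer64(paste0("1514772184120000", c("025", "005", "040")))

result <- find_interval64(queries, keys)
print(result)  # 2 0 4
```

With x from above, the call would be find_interval64(search_values, x$new_col), since x$new_col is already sorted by setorder(). The loop is plain R, so it will be slow on large inputs.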