
Problem

I am using the [IRanges][1] package and need to represent exact coordinates in very long sequences, with positions that exceed 2^31 by about 10-fold.

From the following, it seems that IRanges stores coordinates as 32-bit integers (int32):

##### INSTALLATION FROM SRC CODE ######
## try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite("IRanges")

##### CALL PACKAGE #####
require(IRanges)


IRanges(start=1,end=2^31-1) # Works fine

IRanges(start=1,end=2^31)   # Fails
Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") : 
  solving row 1: range cannot be determined from the supplied arguments (too many NAs)
In addition: Warning message:
In .normargSEW0(end, "end") : NAs introduced by coercion to integer range

As this package is often used for DNA sequences, it would be very useful to be able to deal with values greater than 2^31 (≈ 2 × 10^9), as many organisms have genomes longer than that.

Question

  • Am I right to think that this is an integer overflow issue?
  • Do you encounter the same issue?
  • Is there a way around this problem?
    • Is it possible (and easy) to find the source code and just modify the object type?
    • Does another version of this package exist that supports larger values?

The only solution I have found is to accept a loss of precision and divide each width by 100, but I am not very happy with decreasing my accuracy.

R version

R version 3.2.3 (2015-12-10) -- "Wooden Christmas-Tree"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Remi.b

1 Answer

You are hitting the upper limit of R's integer type, which is a 32-bit signed integer:

> .Machine$integer.max
#[1] 2147483647
> log2(.Machine$integer.max)
#[1] 31

There are special packages such as bit64 or gmp that can be used to handle larger values, e.g., 64-bit signed integers. However, it is not certain that integers as represented by such packages are compatible with other packages such as IRanges.
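To make the limit concrete, here is a small base-R sketch (no extra packages assumed). It reproduces the coercion to NA that appears in the warning from the question, and it shows that R's default numeric type (double) still stores whole numbers exactly up to 2^53. The variable names are only illustrative, and none of this makes IRanges itself accept larger values.

```r
# Largest value an R integer (32-bit signed) can hold
int_max <- .Machine$integer.max          # 2^31 - 1

# Coercing 2^31 to integer is out of range: R returns NA with a warning,
# the same kind of coercion reported in the IRanges warning in the question
overflowed <- suppressWarnings(as.integer(2^31))
is.na(overflowed)                        # TRUE

# Doubles store whole numbers exactly up to 2^53
big <- 2^31 * 10                         # about ten times the integer limit
big == floor(big)                        # TRUE: still a whole number
(big + 1) - big == 1                     # TRUE: neighbouring positions stay distinct
```

So coordinates around ten times 2^31 can be kept exactly as plain doubles, as long as the code that consumes them does not coerce them back to integer.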

RHertel
  • Loading the `bit64` package does not seem to allow larger numbers within `IRanges`. I suppose I'll have to rewrite some of the functionality of `IRanges` then. Thanks – Remi.b Mar 13 '16 at 23:25
  • @Remi.b were you able to find a work-around or did you end up having to edit the `IRanges` source? – knowah Sep 28 '16 at 15:38
  • I made my own series of IRanges-like functions and I just worked with a data.frame rather than creating a new class. It was actually very easy (it took maybe 3 hours, debugging included). I haven't put this code on GitHub as I don't use GitHub, but I should and I will soon. – Remi.b Sep 28 '16 at 16:00
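
The code mentioned in the last comment is not shown, so the following is only a guess at the general shape of such a workaround, not the commenter's implementation. It keeps ranges in a data.frame with double coordinates and implements one IRanges-like operation, an overlap test. The helper names `make_ranges` and `overlaps_range` are hypothetical and are not part of IRanges.

```r
# Hypothetical helper: ranges table with double (not integer) coordinates,
# so positions beyond 2^31 - 1 are kept exactly (up to 2^53)
make_ranges <- function(start, end) {
  start <- as.numeric(start)
  end <- as.numeric(end)
  stopifnot(length(start) == length(end), all(start <= end))
  data.frame(start = start, end = end, width = end - start + 1)
}

# Hypothetical helper: for each range in `x`, does it overlap the single
# closed range in `query`?
overlaps_range <- function(x, query) {
  stopifnot(nrow(query) == 1)
  x$start <= query$end & x$end >= query$start
}

r <- make_ranges(start = c(1, 2^31, 3e10),
                 end   = c(2^31 - 1, 2^32, 3e10 + 99))
q <- make_ranges(start = 2^31 + 5, end = 2^31 + 10)

overlaps_range(r, q)   # FALSE TRUE FALSE
```

This trades the optimized, compiled routines of IRanges for simplicity, so it may be noticeably slower on large sets of ranges.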