I am facing a big issue in computing the cumsum()
of a vector. The vector has a length of ~ 10,000 elements and from element say 2000 the values go down to 1e-310. To give a feeling of the distribution I am dealing with, here is a plot.
When I try to apply cumsum()
I get lots of ones, which is impossible, and a minimum value around 10^-2. I am porting a code which we developed in Matlab and of course no problems there. For some reason, R seems to have troubles in working with such small numbers to the extent that applying standard functions returns unexpected, and wrong, results.
I searched over stack overflow and found these two posts:
- R: Number precision, how to prevent rounding?
- Controlling number of decimal digits in print output in R
Unfortunately, none of them helped me out.
I also tried to use Rcpp cumsum()
function with no luck. I guess the problem comes directly from the precision of my matrix object.
I am not even sure how to reproduce this so I am happy to share my 9137 x 2 matrix. I am completely stuck with this.
Looking forward to hearing from you guys!
Thanks
Update
Here is a sample of 100 elements from my matrix:
y <- sample( BestPair, 100 )
dput( y )
c(7.74958917761984e-289, 4.19283869686715e-319, 1.52834266884531e-319,
2.22089175309335e-297, 4.93980517192742e-298, 1.37861543592719e-301,
1.47044459800611e-317, 6.49068860911021e-319, 1.83302927898675e-305,
8.39514422452147e-312, 2.88487711616485e-300, 0.000544461085044608,
0.000435738736513519, 1.35649914843994e-309, 4.30826678309556e-310,
2.60728322623343e-319, 0.000544460617547516, 5.28815204888643e-299,
0.00102710912090133, 0.00198425117943324, 1.99711912768841e-304,
8.34594499227505e-306, 7.42055412763084e-300, 5.00039717762739e-311,
1.8750204972032e-305, 1.06513324565406e-310, 5.00487443690634e-313,
3.4890421843663e-319, 7.48945537292364e-310, 1.92948452007191e-310,
1.19840058299897e-305, 0.000532438536688165, 6.53966533658245e-318,
0.000499821676385928, 2.02305525482572e-305, 5.18981575493413e-306,
8.82648276295387e-320, 7.30476057376283e-320, 1.23073492422415e-291,
4.1801705284367e-311, 5.10863383734203e-318, 1.12106998189819e-298,
9.34823978505262e-297, 0.00093615863896107, 5.3667092510628e-311,
3.85094526994501e-319, 1.3693720559483e-313, 3.96230943126873e-311,
2.03293191294298e-319, 2.38607510351427e-291, 0.000544460855322345,
1.74738584846597e-310, 1.41874408662835e-318, 5.73056861298345e-319,
3.28565325597139e-302, 3.5412805275117e-310, 1.19647007227024e-302,
1.71539915106223e-305, 2.10738303243284e-311, 6.47783846432427e-313,
5.0072402480979e-303, 7.7250380240544e-303, 9.75451890703679e-309,
0.000533945755492525, 0.00211359631486468, 1.6612179399628e-312,
0.000521804571338402, 4.12194185271951e-308, 1.12829365794294e-313,
8.89772702908418e-319, 5.092756929242e-312, 7.45208240537024e-311,
6.60385177095196e-298, 0.000544461017733648, 1.62108867188263e-318,
3.95135528339003e-309, 1.8792966379072e-292, 5.98494480819088e-295,
0.00051614492665081, 2.25198141886419e-300, 7.97467977809552e-305,
1.78098757558338e-311, 1.66946525895122e-313, 0.000778442249425894,
6.58100207570114e-312, 0.00120733768329515, 3.44432924341767e-319,
6.38151190880713e-313, 7.1129669216109e-300, 4.11319531475755e-319,
7.21747577033383e-304, 1.48709312807403e-318, 1.39519898470211e-303,
4.58585270141592e-312, 2.16309869205282e-295, 7.55248601743105e-314,
3.16365476892733e-310, 1.96961507010996e-305, 3.21125377499206e-318,
3.66277772043574e-304)
Update 2
Apparently, imposing the following:
BestPair[ BestPair < .Machine$double.eps ] <- 0
does not solve the issue. Still finding weird results from cumsum()
. Here is a plot to better explain what I am dealing with. The Cumulative Prob. has this shape because BestPair
has been sorted by decreasing order. I want to have the 1 from cumsum()
on top of my vector.
Here is a summary of the ob
> summary(CumProb)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0250 1.0000 1.0000 0.9685 1.0000 1.0000
Update 3. Results from Matlab
Here is the result as computed with Matlab. As you can see, I can get a pretty decent distribution which I cannot achieve in R even if I truncated the original matrix.