
I sometimes have to deal with very low p-values and present them in a tabular format. The values returned by R can have many significant digits (i.e. digits after the decimal point). Since the p-value is so low anyway, I tend to shorten the values before writing them to a .xls or .tsv file (just to make the tables look pretty!).

I am using R version 3.0.0 (2013-04-03)

Some background and examples:

9.881313e-208 in R will be 9.88e-208 in my table

I can use the round function in R to do this.

round(9.881313e-208, 210)
[1] 9.88e-208

However, the exponent can differ in every case, and since there are many such cases I use the following formula:

x = 9.881313e-208
round(x,abs(floor(log10(x)-2)))   ## I came to this following trial and error
[1] 9.88e-208

I have tested this formula empirically and it works in different cases like:

a <- c(1.345678e-150,8.543678e-250,5.555555e-303, 0.01123, 4.523456e-290)
round(a,abs(floor(log10(a)-2)))
[1] 1.35e-150 8.54e-250 5.56e-303  1.12e-02 4.52e-290

Now the problem starts when the exponent goes beyond 306 (even 307 is fine, but things start getting strange after 308):

## Example 1:
b <- c(1.345678e-306,1.345678e-307,1.345678e-308, 1.345678e-309, 1.345678e-310)
round(b,abs(floor(log10(b)-2)))
[1] 1.35e-306 1.30e-307 1.00e-308  0.00e+00  0.00e+00

## Example 2:
b <- c(3.345678e-306,3.345678e-307,3.345678e-308, 3.345678e-309, 3.345678e-310)
round(b,abs(floor(log10(b)-2)))

[1] 3.35e-306 3.30e-307 3.00e-308  0.00e+00  0.00e+00

## Example 3:
b <- c(7.345678e-306,7.345678e-307,7.345678e-308, 7.345678e-309, 7.345678e-310)
round(b,abs(floor(log10(b)-2)))

[1] 7.35e-306 7.30e-307 7.00e-308 1.00e-308  0.00e+00

Also, I checked these directly:

round(7.356789e-306,308)
[1] 7.36e-306

round(7.356789e-307,309)
[1] 7.4e-307

round(7.356789e-308,310)
[1] 7e-308

round(7.356789e-309,311)
[1] 1e-308

round(7.356789e-310,312)
[1] 0

round(7.356789e-311,313)
[1] 0

Am I missing something trivial here, or does the round function hit a resolution limit beyond e-308? I know these values are extremely low and are almost equal to zero, but I would still prefer to keep the exact value. I saw some answers on SO using Python for this problem (see How right way to round a very small negative number?), but are there any suggestions as to how I can overcome this in R?

Help much appreciated

Cheers

Ashwin

Ashwin
  • Have you tried `format()` rather than `round()`? – Andrie May 21 '14 at 13:03
  • I'm with Andrie here: `round` should be used to manipulate the stored values; `format` or `sprintf` should be used to manipulate the **display** of values. – Carl Witthoft May 21 '14 at 13:09
  • If you have such low p-values you should work with their logs. However, normally differences between extremely low p-values are neither meaningful nor reliable. Thus, `format.pval` reports them as `"< [eps]"`, with `eps` equal to `.Machine$double.eps`. – Roland May 21 '14 at 13:13
  • Didn't know about `format` @Andrie, thanks for that, however it now seems `format(2.147484e-325, digits=3,scientific = T)` returns `[1] "0e+00"`. Am I using `format` the right way ? – Ashwin May 21 '14 at 13:16
  • @Roland for visualizing these low p values I do use their logs but here I just wanted to write them in a text file in a shortened way .. – Ashwin May 21 '14 at 13:22
  • 2.147484e-325 is smaller than the smallest positive value that can be represented exactly in IEEE 754 binary floating point, which is about 4.9E-324. The results may be being affected by the small number of significant bits in the representation of subnormal numbers, numbers below about 2.22e-308. – Patricia Shanahan May 21 '14 at 14:30
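The `format()`/`sprintf()` approach from the comments can be sketched like this (display-only shortening: the stored value is never touched by arithmetic, so `round()`'s underflow problem never arises):

```r
## Shorten for display with sprintf(), not round()
x <- 9.881313e-208
sprintf("%.2e", x)              # "9.88e-208"

## Works for subnormal values too, as long as they are representable at all
sprintf("%.2e", 7.356789e-310)  # the mantissa survives, unlike with round()
```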

2 Answers


This answer is based on the assumption that R's floating point numbers are being represented by IEEE 754 64-bit binary numbers. That is consistent with the reported results, and inherently very likely.

Doing much arithmetic on numbers with absolute magnitude below about 2e-308 is very problematic. Below that point, precision drops with magnitude. The smallest representable number, about 4.9e-324, has one significant bit in its representation. Its adjacent representable numbers are 0 and about 1.0e-323. Any rounding error will either reduce it to zero or double it. It cannot experience a subtle rounding error that only affects low-significance digits in its decimal representation. Similarly, `round` cannot change it slightly: it can return it unchanged, double it, return 0, or make an even bigger change.
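This is easy to see in an R session (a sketch assuming IEEE 754 doubles; exact printed values may vary slightly by platform):

```r
.Machine$double.xmin   # smallest *normalized* double, about 2.225074e-308
tiny <- 5e-324         # smallest positive (subnormal) double: one significant bit
tiny / 2               # nothing representable between 0 and tiny, so this underflows to 0
tiny * 3               # representable neighbors are coarse multiples of tiny
```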

See Denormal number for more explanation of what is happening.

The solution is, if at all possible, to avoid doing arithmetic on such tiny numbers. One good approach, as already pointed out in the comments, is to use logarithms. That is the only way to go if you need to deal with very large as well as very small numbers.
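For instance, many of R's distribution functions can return the p-value directly on the log scale via `log.p = TRUE`, so the exponent never underflows, and the "m.mme-xxx" string can be built from the log10 value alone. A sketch (the `pnorm` tail probability here is just an illustrative source of a tiny p-value):

```r
## p-value far below the double underflow limit, kept on the log scale
log10p <- pnorm(40, lower.tail = FALSE, log.p = TRUE) / log(10)  # about -349

## Reconstruct mantissa and exponent from log10(p) without ever forming p itself
e <- floor(log10p)
m <- round(10^(log10p - e), 2)
sprintf("%.2fe%d", m, e)
```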

If that is not possible, and your numbers are all small, consider scaling by a moderately large power of two. Powers of two that are in range are exactly representable, and multiplying by them only changes the exponent, with no rounding error. You could scale all the numbers you are going to store in the text file by a constant before rounding.
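A sketch of the scaling idea, reusing the questioner's rounding rule (the factor 2^1000 is arbitrary; any power of two that lifts the values well above ~2.2e-308 works):

```r
## Multiply by an exact power of two: only the exponent changes, no rounding error
scale <- 2^1000
b  <- c(7.345678e-306, 7.345678e-309, 7.345678e-310)
bs <- b * scale                       # now safely inside the normalized range
r  <- round(bs, abs(floor(log10(bs) - 2)))
r / scale                             # scale back down before writing out
```

Note that values much deeper into the subnormal range (around 1e-320 and below) have already lost significant digits when they were parsed, and no amount of scaling afterwards can recover them.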

Patricia Shanahan
  • This is right on: see `?.Machine`, and particularly `.Machine$double.xmin` for R- and platform-specific information. – Ben Bolker May 21 '14 at 15:32

To overcome this, you can use the Rmpfr package, which provides arbitrary-precision floating-point numbers.

Example:

library(Rmpfr)
x <- mpfr(7.356789e-311, 80)
round(x, 313)
1 'mpfr' number of precision  1040   bits 
[1] 7.36000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000008e-311
art