I want to format figures with 2 significant digits using formatC, but it behaves strangely. Here are the figures:

(x <- data.frame(  cil = c(1.234, 0.444, 0.712, 0.999, 1.999)
                 , ciu = c(1.812, 1.234, 0.999, 1.199, 2.690)
                 )
 )

x$ci <- with(x,paste("("
              , formatC(cil, format="g", digits=2, flag="#")
              , "-"
              , formatC(ciu, format="g", digits=2, flag="#")
              ,")"
                      )
              )
x

And here are the results:

    cil   ciu             ci
1 1.234 1.812  ( 1.2 - 1.8 )
2 0.444 1.234 ( 0.44 - 1.2 )
3 0.712 0.999 ( 0.71 - 1.0 )
4 0.999 1.199  ( 1.0 - 1.2 )
5 1.999 2.690  (  2. - 2.7 )

In case 5 I expected `2.0`, not `2.`. Is there an explanation for this? Did I do something wrong when defining the parameters?

giordano
  • Maybe it doesn't really answer your question, but you can use `format` with `nsmall`: `format(1.999, digits=2, nsmall=2)` gives you `"2.00"` (which I actually find better than `"2.0"` when you're displaying values rounded to the 2nd decimal) – Cath Dec 19 '14 at 10:01
  • @giordano I couldn't reproduce the problem. – akrun Dec 19 '14 at 10:02
  • @giordano Could you show the R version? I am using `R 3.1.2` – akrun Dec 19 '14 at 10:31
  • @akrun Thanks for help. My version is: R version 3.1.0 (2014-04-10). – giordano Dec 19 '14 at 10:36
  • @giordano There might be some changes with each upgrade. It may be best to try it on the new version. BTW, I am using it on linux. – akrun Dec 19 '14 at 10:36
  • @akrun I also tried with R version 3.1.2 on Windows 7 (the former was also on Windows 7). Maybe it has something to do with the OS. – giordano Dec 19 '14 at 10:53
  • @giordano Maybe, I am not sure. – akrun Dec 19 '14 at 10:54
  • @giordano You could use regex too to solve the problem. Please check my update. – akrun Dec 19 '14 at 11:48
  • @akrun Yes, you're right. I wanted to avoid this, and I was wondering whether I had misunderstood the concept of significant digits or was using formatC incorrectly. Since you got the expected value on Linux, it apparently has to do with Windows 7, which is also curious. – giordano Dec 19 '14 at 13:46

2 Answers

Using R 3.1.2 on Linux Mint 17, I couldn't reproduce the problem: I get exactly your expected output. Still, here is an option that uses paste together with do.call, which is handy if you have many columns (or in general).

1) Using formatC

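# Select the bound columns by name so the call does not depend on column positions
x$ci <- paste0("(", do.call(paste, c(lapply(x[c("cil", "ciu")], function(col)
  formatC(col, format = 'g', digits = 2, flag = '#')), list(sep = " - "))), ")")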
x$ci
#[1] "(1.2 - 1.8)"  "(0.44 - 1.2)" "(0.71 - 1.0)" "(1.0 - 1.2)"  "(2.0 - 2.7)" 

Note that the output above is exactly the expected output.

2) Using sprintf

Another option would be sprintf, if you want fixed two-decimal output similar to what @CathG showed before the update.

# '%0.2f' prints exactly two decimals, regardless of significant digits
x$ci <- paste0("(", do.call(paste, c(lapply(x[c("cil", "ciu")], function(col)
  sprintf('%0.2f', col)), list(sep = "-"))), ")")
x$ci
#[1] "(1.23-1.81)" "(0.44-1.23)" "(0.71-1.00)" "(1.00-1.20)" "(2.00-2.69)"

Update

You could use a regex to solve the problem. For example, I set the 5th entry to match your output and used a regex lookbehind:

x$ci[5] <- "( 2. - 2.7 )"
sub('(?<=\\.) ', '0', x$ci, perl=TRUE)
#[1] "( 1.2 - 1.8 )"  "( 0.44 - 1.2 )" "( 0.71 - 1.0 )" "( 1.0 - 1.2 )" 
#[5] "( 2.0- 2.7 )"  
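
A possible variant, offered as a sketch rather than a definitive fix: the `sub` call above replaces the space that follows the dot, so the separator loses its leading space, and `sub` only touches the first match in each string. Assuming every bare trailing dot should get a zero appended, a zero-width lookahead combined with `gsub` keeps the spacing intact. The vector `ci` below is made-up sample input, not the original data.

```r
# Hypothetical sample strings, including a malformed "2." like the one in the question
ci <- c("( 1.2 - 1.8 )", "( 2. - 2.7 )", "( 1.0 - 3. )")

# Insert "0" at every position that is preceded by "." and followed by a space.
# Both assertions are zero-width, so no existing character is consumed.
fixed <- gsub("(?<=\\.)(?= )", "0", ci, perl = TRUE)
fixed
#[1] "( 1.2 - 1.8 )" "( 2.0 - 2.7 )" "( 1.0 - 3.0 )"
```

Well-formed entries pass through unchanged, so the call can be applied to the whole column.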
akrun
  • this gives me the exact same result as with the OP's code : `"( 1.2-1.8 )" "( 0.44-1.2 )" "( 0.71-1.0 )" "( 1.0-1.2 )" "( 2.-2.7 )"` (which is not that surprising) – Cath Dec 19 '14 at 10:16
  • @akrun Thanks for this version using lapply. As expected, I got the same result as with my version. Nevertheless, it's a good idea to use your version, especially if there are a lot of columns to format in the same way. – giordano Dec 19 '14 at 11:05
  • @akrun it seems that I have to use the workaround with regex to cope with the "bug". Thanks! – giordano Jan 11 '15 at 20:28

To illustrate what I was saying in my comment, you can do:

x$ci<-with(x,paste("(",
                   format(cil,digits=2,nsmall=2),
                   "-",
                   format(ciu,digits=2,nsmall=2),")"))
> x
  case   cil   ciu              ci
1    A 1.234 1.812 ( 1.23 - 1.81 )
2    B 0.444 1.234 ( 0.44 - 1.23 )
3    C 0.712 0.999 ( 0.71 - 1.00 )
4    D 0.999 1.199 ( 1.00 - 1.20 )
5    E 1.999 2.690 ( 2.00 - 2.69 )

or the following, to suppress the spaces just inside the brackets:

x$ci<-with(x,paste0("(",
                    format(cil,digits=2,nsmall=2),
                    " - ",
                    format(ciu,digits=2,nsmall=2),")"))
> x
  case   cil   ciu          ci
1    A 1.234 1.812 (1.23 - 1.81)
2    B 0.444 1.234 (0.44 - 1.23)
3    C 0.712 0.999 (0.71 - 1.00)
4    D 0.999 1.199 (1.00 - 1.20)
5    E 1.999 2.690 (2.00 - 2.69)

NB: you can get the same result with formatC by using format="f" instead of "g".
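
As a minimal sketch of that remark (the standalone vectors `cil` and `ciu` simply repeat the values from the question): with `format = "f"`, the `digits` argument of formatC counts decimal places rather than significant digits, so every value is printed with exactly two decimals.

```r
# Values copied from the question
cil <- c(1.234, 0.444, 0.712, 0.999, 1.999)
ciu <- c(1.812, 1.234, 0.999, 1.199, 2.690)

# format = "f" uses fixed notation; digits = 2 means two decimal places here
ci <- paste0("(",
             formatC(cil, format = "f", digits = 2),
             " - ",
             formatC(ciu, format = "f", digits = 2),
             ")")
ci
#[1] "(1.23 - 1.81)" "(0.44 - 1.23)" "(0.71 - 1.00)" "(1.00 - 1.20)" "(2.00 - 2.69)"
```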

UPDATE:

I guess the missing 0 after `2.` is just a bug in some R versions (stranger still: if you try the line with 2.01 instead of 1.999, you'll get "2.0"...).
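
To check whether a given installation is affected, a small diagnostic along these lines may help. It is only a sketch: it assumes the behaviour comes from the platform's C-level number formatting, which neither the question nor the comments confirm, and it deliberately shows no expected output because the result is exactly what seems to differ between systems.

```r
# Values on either side of the rounding boundary mentioned above
vals <- c(1.999, 2.01)

# The call from the question: "g" format, 2 significant digits, "#" flag
via_formatC <- formatC(vals, format = "g", digits = 2, flag = "#")

# The same C-style conversion requested through sprintf
via_sprintf <- sprintf("%#.2g", vals)

# Side-by-side comparison, plus platform details worth quoting in a bug report
print(data.frame(value = vals, formatC = via_formatC, sprintf = via_sprintf))
print(R.version.string)
print(.Platform$OS.type)
```

If both columns show `2.` for 1.999, the problem is more likely in the underlying formatting routine than in how formatC is being called.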

To make it work with your line and obtain exactly what you want, wrap the values in the round function:

x$ci<-with(x,paste("(",
                    formatC(round(cil,2), format="g", digits=2, flag="#"),
                   "-",
                   formatC(round(ciu,2), format="g", digits=2, flag="#"),")"))

> x
  case   cil   ciu             ci
1    A 1.234 1.812  ( 1.2 - 1.8 )
2    B 0.444 1.234 ( 0.44 - 1.2 )
3    C 0.712 0.999 ( 0.71 - 1.0 )
4    D 0.999 1.199  ( 1.0 - 1.2 )
5    E 1.999 2.690  ( 2.0 - 2.7 )
Cath