I need help conditionally adding leading or trailing zeros.

I have a data frame with one column containing ICD-9 diagnoses. As a vector, the column looks like:

"33.27" "38.45" "9.25" "4.15" "38.45" "39.9" "84.1" "41.5" "50.3" 

I need every value to have a length of 5 characters, including the period in the middle (not counting the quotation marks). If a value has one digit before the period, it needs a leading zero. If a value has one digit after the period, it needs a trailing zero. So the result should look like this:

"33.27" "38.45" "09.25" "04.15" "38.45" "39.90" "84.10" "41.50" "50.30" 

Here is the vector for R:

icd9 <- c("33.27", "38.45", "9.25", "4.15", "38.45", "39.9", "84.1", "41.5", "50.3")
tshepang
Maria Suprun
  • Why do you need that? If it is for exporting to some tool that requires fixed-width records, spaces are the same as zeros, and easily achieved with `fwrite`. If it is for setting up a tabular display, use table command arguments to set the alignment. – Carl Witthoft Jan 06 '17 at 16:19

4 Answers

This does it in one line:

formatC(as.numeric(icd9),width=5,format='f',digits=2,flag='0')
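
As a rough illustration of what each argument in that call contributes, here is the same one-liner spelled out on the vector from the question. The variable name `padded` is only for illustration and is not part of the original answer.

```r
icd9 <- c("33.27", "38.45", "9.25", "4.15", "38.45", "39.9", "84.1", "41.5", "50.3")

padded <- formatC(
  as.numeric(icd9),  # convert the strings to numbers first
  width  = 5,        # total field width, counting the period
  format = "f",      # fixed-point notation
  digits = 2,        # two digits after the period (adds trailing zeros)
  flag   = "0"       # pad on the left with zeros instead of spaces
)
print(padded)
# [1] "33.27" "38.45" "09.25" "04.15" "38.45" "39.90" "84.10" "41.50" "50.30"
```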

James Tobin

ICD-9 codes have some formatting quirks which can lead to misinterpretation with simple string processing. The icd package on CRAN takes care of all the corner cases when doing ICD processing, and has been battle-tested over about six years of use by many R users.

Jack Wasey

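The function `change` below left-pads each string with zeros up to `n` characters (by default, the length of the longest string). I think it can help, though it only adds leading zeros: "9.25" becomes "09.25", but "39.9" becomes "039.9" rather than "39.90".

    change <- function(x, n = max(nchar(x))) gsub(" ", "0", formatC(x, width = n))
    icd92 <- change(icd9, 5)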
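
For padding on both sides while keeping the values as strings, one possible sketch is shown here. The helper name `pad_icd9` is hypothetical (it does not come from this answer), and it assumes every code has one or two digits on each side of a single period, as in the question.

```r
icd9 <- c("33.27", "38.45", "9.25", "4.15", "38.45", "39.9", "84.1", "41.5", "50.3")

# Hypothetical helper: pads with string operations only, no numeric conversion.
pad_icd9 <- function(x) {
  # One digit before the period: prepend a zero.
  x <- ifelse(grepl("^[0-9]\\.", x), paste0("0", x), x)
  # One digit after the period: append a zero.
  ifelse(grepl("\\.[0-9]$", x), paste0(x, "0"), x)
}

result <- pad_icd9(icd9)
print(result)
# [1] "33.27" "38.45" "09.25" "04.15" "38.45" "39.90" "84.10" "41.50" "50.30"
```

Because nothing is converted to a number, codes are never rounded, and any code that does not match either pattern is returned unchanged.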
Keniajin

You can also use sprintf after converting the vector to numeric.

sprintf("%05.2f", as.numeric(icd9))
[1] "33.27" "38.45" "09.25" "04.15" "38.45" "39.90" "84.10" "41.50" "50.30"

Notes

  • See the examples in ?sprintf to work out the proper format.
  • There is some risk of introducing errors due to numerical precision here, though it works well in the example.
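
One way to guard against the risk mentioned in the notes is a quick round-trip check after formatting. This is only a sketch: the names `padded`, `values_match` and `not_numeric` are illustrative, and the "V10.1" input is an assumed example of a code with a letter prefix, not a value from the question.

```r
icd9 <- c("33.27", "38.45", "9.25", "4.15", "38.45", "39.9", "84.1", "41.5", "50.3")

padded <- sprintf("%05.2f", as.numeric(icd9))

# Each padded code should still parse to the same number as the original.
values_match <- all(as.numeric(padded) == as.numeric(icd9))

# Codes that are not plain numbers turn into NA during conversion, so flag
# them before formatting. "V10.1" is an assumed example input.
not_numeric <- is.na(suppressWarnings(as.numeric(c(icd9, "V10.1"))))

print(values_match)
print(which(not_numeric))
```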
lmo