I am new to R and to posting here, so please forgive me if I miss some protocol. I am creating temporary vectors in order to add zeros where needed. Ultimately I want every value to have 12 digits, and where a value is shorter I add as many zeros as it is missing. However, when I try to paste the appropriate zeros onto the indexed elements of my temporary vector, I get the error message shown further down. Here is my code:

colnames(ALLMBRS) <- c("SSN","tracts","GeoBlock","GeoCodeBlck","GeoMatch") #TA Members Tracts
#Remove special characters and decimals
tmp1 <- str_replace_all(ALLMBRS$GeoCode,"[[:punct:]]","")
#Temporary Vector of ALLMBRS
tmp2 <- tmp1
#Vectors of Indices used to add 0's
add1 <- str_length(ALLMBRS$tracts) == 11
add2 <- str_length(ALLMBRS$tracts) == 10
add3 <- str_length(ALLMBRS$tracts) == 9
add4 <- str_length(ALLMBRS$tracts) == 8
add5 <- str_length(ALLMBRS$tracts) == 7
#Paste temporary vector indices into temporary vector
tmp2[add1] <- paste(tmp2[add1],"0",sep="")
tmp2[add2] <- paste(tmp2[add2],"00",sep="")
tmp2[add3] <- paste(tmp2[add3],"000",sep="")
tmp2[add4] <- paste(tmp2[add4],"0000",sep="")
tmp2[add5] <- paste(tmp2[add5],"00000",sep="")

Example of Data:

 [1] "0"            "0"            "0"            "0"            "0"            "0"           
 [7] "0"            "360010146121" "720210310133" "0"            "517100023001" "90034808002" 
[13] "250158202021" "250158211004" "250138125003" "290470203002" "250138124031" "250158202033"
[19] "250138019012" "250138112002"

I expect all values to contain 12 digits. So for element [1] I would like to see

000000000000

and for element [12]

900348080020

Error Message: Error in tmp2[add1] <- paste(tmp2[add1],"0",sep = ""):
NAs are not allowed in subscripted assignments

If I do have NAs in my data, how can I work around this so that I can accomplish my task? Thank you for any help.

Rich Scriven
user3067851

1 Answer

You can use str_pad from stringr to pad the strings to a fixed width. Set the pad argument to "0"; by default the padding is added on the left.

> x <- c("0", "0", "0", "0", "0", "0", "0", "360010146121",
         "720210310133", "0", "517100023001", "90034808002",
         "250158202021", "250158211004", "250138125003", 
         "290470203002", "250138124031", "250158202033",
         "250138019012", "250138112002")
> library(stringr)
> str_pad(x, 12, pad = "0")
# [1] "000000000000" "000000000000" "000000000000" "000000000000"
# [5] "000000000000" "000000000000" "000000000000" "360010146121"
# [9] "720210310133" "000000000000" "517100023001" "090034808002"
#[13] "250158202021" "250158211004" "250138125003" "290470203002"
#[17] "250138124031" "250158202033" "250138019012" "250138112002"

Update: As reported in the comments, str_pad can fail with an 'invalid times value' error when the vector contains NA values. In that case, you can do

x[!is.na(x)] <- str_pad(x[!is.na(x)], 12, pad = "0")

to pad the values and leave the NAs untouched. For example,

> y <- c("0", NA, "123", "68")
> y[!is.na(y)] <- str_pad(y[!is.na(y)], 12, pad = "0")
> y
# [1] "000000000000" NA             "000000000123" "000000000068"
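
As for the error in the question itself, my understanding is that it comes from the logical index rather than from paste. str_length (like base nchar) returns NA for an NA string, so add1 contains NA, and R refuses a subscripted assignment with NA in the index when the replacement has more than one element. Below is a rough base R sketch of one way around that, wrapping the condition in which() so the NAs are dropped. The vectors tracts and codes are made-up stand-ins for the asker's columns, not the real data:

```r
# Made-up stand-ins for ALLMBRS$tracts and the cleaned code vector
tracts <- c("12345678901", NA, "1234567890", "123456789012")
codes  <- tracts

add1 <- nchar(tracts) == 11   # TRUE  NA FALSE FALSE
add2 <- nchar(tracts) == 10   # FALSE NA TRUE  FALSE

# Indexing with the raw logical vector fails because of the NA;
# the error is caught here so the rest of the script keeps running
err <- tryCatch({
  bad <- codes
  bad[add1] <- paste0(bad[add1], "0")
  NULL
}, error = function(e) conditionMessage(e))

# which() keeps only the positions that are TRUE and drops the NA
i1 <- which(add1)
i2 <- which(add2)
codes[i1] <- paste0(codes[i1], "0")
codes[i2] <- paste0(codes[i2], "00")
codes
# [1] "123456789010" NA             "123456789000" "123456789012"
```

The same which() wrapping should apply to the other add vectors in the question, though padding in one step with str_pad is shorter.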
Rich Scriven
  • Thank you Richard. I am receiving an error message: 'invalid times value'. Would this be because I have NAs as some of the values? – user3067851 Sep 24 '14 at 16:33
  • @user3067851 I just tried it, and yes it seems so. Try `x[!is.na(x)] <- str_pad(x[!is.na(x)], 12, pad = "0"); x` and see if that works. And I'll update the answer if that works for you. – Rich Scriven Sep 24 '14 at 16:36