8

I am creating a SAS dataset from a database that includes a VARCHAR(5) key field.

This field includes some entries that use all 5 characters and some that use fewer.

When I import this data, I would prefer to pad all the shorter entries out to use all five characters. For this example, I want to pad on the left with 0, the character zero. So, 114 would become 00114, ABCD would become 0ABCD, and EA222 would stay as it is.

I've attempted this with a simple data statement, but of course the following does not work:

data test;
    set databaseinput;
    format key $5.;
run;

I've tried to do this with a user-defined informat, but I don't think it's possible to specify the ranges correctly on character fields, per this SAS KB answer. Plus, I'm fairly sure proc format won't let me define the result dynamically in terms of the incoming variable.

I'm sure there's an obvious solution here, but I'm just missing it.

purple_arrows
  • 265
  • 1
  • 2
  • 8

5 Answers5

6

Here is an alternative:

data padded_data_dsn; length key $5;
    drop raw_data;
    set raw_data_dsn(rename=(key=raw_data));
    key = translate(right(raw_data),'0',' ');
run;
ChrisP
  • 5,812
  • 1
  • 33
  • 36
5
Data raw_data_dsn;
format key $5.;
key = '4'; key1 = CATT(REPEAT('0',5-length(key)),key);output;
key = 'A114'; key1 = CATT(REPEAT('0',5-length(key)),key);output;
key = 'A1140'; key1 = CATT(REPEAT('0',5-length(key)),key);output;
run;
2

I'm sure someone will have a more elegant solution, but the following code works. Essentially it is padding the variable with five leading zeros, then reversing the order of this text string so that the zeros are to the right, then reversing this text string again and limiting the size to five characters, in the original order but left-padded with zeros.

data raw_data_dsn;
   format key $varying5.;
   key = '114'; output;
   key = 'ABCD'; output;
   key = 'EA222'; output;
run;

data padded_data_dsn;
   format key $5.;
   drop raw_data;
   set raw_data_dsn(rename=(key=raw_data));
   key = put(put('00000' || raw_data ,$revers10.),$revers5.);
run;
RWill
  • 939
  • 5
  • 6
0

Here's what worked for me.

data b (keep = str2);
    format str2 $5. ;
    set a;
    catlength = 4 - length(str); 
    cat = repeat('0', catlength);
    str2 = catt(cat, str); 
run;

It works by counting the length of the existing string, and then creating a cat string of length 4 - that, and then appending the cat value and the original string together.

Notice that it screws up if the original string is length 5. Also - it won't work if the input string has a $5. format on it.

data a; /*input dataset*/
    input str $;
    datalines;
    a
    aa
    aaa
    aaaa
    aaaaa
    ;
run;

data b (keep = str2);
    format str2 $5. ;
    set a;
    catlength = 4 - length(str); 
    cat = repeat('0', catlength);
    str2 = catt(cat, str); 
run;

input:
a
aa
aaa
aaaa
aaaaa

output:
0000a   
000aa   
00aaa   
0aaaa   
0aaaa   
dwjohnston
  • 11,163
  • 32
  • 99
  • 194
0

I use this, but only works with numeric values :S. Try with another formats in the INPUT

data work.prueba;
    format xx $5.;
    xx='1234';
    vv=PUT(INPUT(xx,best5.),z5.);
run;
Diego
  • 1