
Why does this code not need two trim statements, one for first and one for last name? Does the length statement remove blanks?


data work.maillist;
  set cert.maillist;
  length FullName $ 40;
  FullName = trim(firstname) || ' ' || lastname;
run;
Ryan M

3 Answers


length is a declarative statement: it adds a variable to the Program Data Vector (PDV) with the length you specify. It does not remove blanks. When a variable that has not been declared is first used in an expression, SAS assigns it a default length based on the expression or usage context.
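A minimal sketch of the difference between a declared and an implied length. The variable names `declared` and `implied` are made up for illustration; `vlength()` returns the storage length the compiler assigned to each variable.

```sas
data _null_;
  length declared $ 40;      /* length fixed by the LENGTH statement */
  declared = 'AB';
  implied  = 'AB';           /* no LENGTH statement: length 2 comes from this first assignment */
  implied  = 'ABCD';         /* truncated to 'AB' because the length is already set */
  len_declared = vlength(declared);   /* 40 */
  len_implied  = vlength(implied);    /* 2 */
  put len_declared= len_implied= implied=;
run;
```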

Character variables in SAS have a fixed length and are padded with spaces on the right. That is why trim(firstname) is needed before the || concatenation with lastname. Without it, the right padding of firstname would become part of the concatenated value, and the result could exceed the length of the variable receiving it.
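To see the padding, here is a small sketch with made-up values and an assumed length of 10 for both name variables:

```sas
data _null_;
  length firstname lastname $ 10;
  firstname = 'Ann';         /* stored as 'Ann' followed by 7 pad blanks */
  lastname  = 'Lee';
  /* without TRIM the 7 pad blanks stay in the middle of the result */
  padded  = firstname || ' ' || lastname;
  /* with TRIM only the single separator blank remains between the names */
  trimmed = trim(firstname) || ' ' || lastname;
  put padded= / trimmed=;
run;
```

In `padded` there are eight blanks between Ann and Lee (seven from the padding plus the separator); in `trimmed` there is one.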

There are also concatenation functions that can simplify string operations:

  • CAT same as using <var>|| operator
  • CATT same as using trim(<var>)||
  • CATS same as using trim(left(<var>))||
  • CATX same as using CATS with a delimiter.
  • STRIP (not a concatenation function, but related) same as trim(left(<var>))
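The list above can be sketched side by side. The variables `a` and `b` and their values are invented for the example; the comments give the value each call should return.

```sas
data _null_;
  length a b $ 8 r_cat r_catt r_cats r_catx $ 20;
  a = '  x';                   /* two leading blanks, padded on the right to length 8 */
  b = 'y';
  r_cat  = cat(a, b);          /* like a || b: leading and trailing blanks are kept */
  r_catt = catt(a, b);         /* trailing blanks removed, leading blanks kept: '  xy' */
  r_cats = cats(a, b);         /* leading and trailing blanks removed: 'xy' */
  r_catx = catx('-', a, b);    /* stripped and joined with the delimiter: 'x-y' */
  /* $CHAR20. is used so that leading blanks are not hidden in the log */
  put r_cat= $char20. / r_catt= $char20. / r_cats= $char20. / r_catx= $char20.;
run;
```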

Your expression could be re-coded as:

fullname = catx(' ', firstname, lastname);
Richard
  • Note that your CATX() call will produce a different result when FIRSTNAME is empty. In that case the CATX() call will not add the leading spaces that the version using `TRIM()` and `||` outputs. – Tom Apr 10 '20 at 00:55
  • Thank you! Would it be right to say that when using the concatenation operator you don't need to trim trailing blanks on the last variable? For example: = trim(one) || trim(two) || trim(three) || four; – Ryan M Apr 10 '20 at 13:40
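A sketch of the difference Tom describes, assuming a blank FIRSTNAME and made-up lengths. `verify(x, ' ')` returns the position of the first non-blank character, which makes the leading spaces visible as a number.

```sas
data _null_;
  length firstname lastname $ 10 via_trim via_catx $ 40;
  firstname = ' ';
  lastname  = 'Lee';
  /* TRIM of an all-blank value returns one blank, so the result starts with two blanks */
  via_trim = trim(firstname) || ' ' || lastname;
  /* CATX skips blank arguments and their delimiter, so the result starts with 'Lee' */
  via_catx = catx(' ', firstname, lastname);
  pos_trim = verify(via_trim, ' ');   /* 3 */
  pos_catx = verify(via_catx, ' ');   /* 1 */
  put pos_trim= pos_catx=;
run;
```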

Is there a reason you think it should? Can you see trailing spaces in the surname? Have you tried the length() function?
I could be wrong here, but sometimes when you apply a function (put especially) or import data, you can inadvertently store leading or trailing spaces. Trailing spaces are tricky because you don't realise they are there until you try to do something else with the data.
A length statement should let you store exactly the data you give it, provided you use the right variable type (numeric or character); truncation only occurs if the length is too short. I've found the compress() function to be the most convenient for dealing with white space and punctuation, particularly if you are concatenating variables.

https://www.geeksforgeeks.org/sas-compress-function-with-examples/

All the best,

Phil

blake
  • Note that COMPRESS() does a totally different thing than TRIM() or STRIP(). It will remove the spaces between the individual words in the string you pass it. And you can also use it to remove characters other than space. – Tom Apr 10 '20 at 01:36
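The distinction in the comment above can be checked with a short sketch. The value of `name` is invented; `length()` ignores trailing blanks, so it shows how many characters each function leaves.

```sas
data _null_;
  length name t s c $ 20;
  name = '  Mary Ann';
  t = trim(name);       /* only trailing blanks removed: '  Mary Ann' */
  s = strip(name);      /* leading and trailing blanks removed: 'Mary Ann' */
  c = compress(name);   /* every blank removed, including the one between words: 'MaryAnn' */
  len_t = length(t);    /* 10 */
  len_s = length(s);    /* 8 */
  len_c = length(c);    /* 7 */
  put len_t= len_s= len_c= c=;
run;
```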

Because SAS will truncate the value when it is too long to fit into FULLNAME. And when it is too short, it will fill the rest of FULLNAME with spaces anyway, so there is no need to remove the trailing spaces from LASTNAME.

It would only be an issue if the length of FULLNAME is smaller than the sum of the lengths of FIRSTNAME and LASTNAME plus one. Otherwise the result cannot be too long to fit into FULLNAME, even if there are no trailing spaces in either FIRSTNAME or LASTNAME.

Try it yourself with non-blank values so it is easier to see what is happening.

data test;
  length one $1 two $2 three $3 ;
  one = 'ABCD';
  two = 'ABCD';
  three='ABCD';
  put (_all_) (=);
run;

one=A two=AB three=ABC
NOTE: The data set WORK.TEST has 1 observations and 3 variables.
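The same behaviour can be checked on the name expression from the question. This is only a sketch: the names, the length of 10 for the inputs, and the deliberately short variable `short` are assumptions made for the example.

```sas
data _null_;
  length firstname lastname $ 10 fullname $ 40 short $ 12;
  firstname = 'Alexander';
  lastname  = 'Hamilton';
  /* at most 10 + 1 + 10 = 21 characters, so it always fits in 40 */
  fullname = trim(firstname) || ' ' || lastname;
  /* the same 18-character value does not fit in 12, so it is cut to 'Alexander Ha' */
  short    = trim(firstname) || ' ' || lastname;
  stored_len  = lengthc(fullname);   /* 40: the storage length, padding included */
  visible_len = length(fullname);    /* 18: position of the last non-blank character */
  put stored_len= visible_len= short=;
run;
```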
Tom