1

Hello I have an issue with reading in data

I have a csv that im trying to read into sas and the phone number is in the form of (###)###-#### and I want to create an informat that reads in the #'s only.

user3648688
  • 69
  • 1
  • 5

2 Answers2

1

The only way to create an informat of this type would be to specify an individual character string and informatted value for every possible phone number you want to match, which obviously isn't a practical approach.

However, you can easily create a new variable that contains only the digits from the raw imported text, e.g.

data _null_;
    phone_raw = '(123)456-7890';
    length phone_digits $10;
    phone_digits = compress(phone_raw,,'dk');
    put phone_raw= / phone_digits=;
run;
user667489
  • 9,501
  • 2
  • 24
  • 35
1

You can create an informat that removes the characters and only reads the numbers:

proc format;
  invalue comph (default=20)
  's/\D//' (regexpe) = _same_;
run;
data _null_;
  a=input("(423)456-7890",comph.);
  put a=;
run;

The example will output the numbers correctly. No validation of the phonenumber is performed. The strange string 's/\D//' is a regular expression and should be read as: Replace all non-numbers (\D) with nothing.

Stig Eide
  • 1,052
  • 7
  • 14
  • 1
    If you want to do some data validation as well, you could consider creating a function and using that in the informat. This example creates an informat, tsgn, that uses the output from the (user-written) function tsgn as invalue: proc format; invalue tsgn(default=10) other=[tsgn()]; – Stig Eide Nov 10 '14 at 14:44
  • 1
    Nice - didn't realise informats could contain regex! – Allan Bowe Nov 12 '14 at 20:31