2

I need help to split a row into multiple rows when the value on the row is something like 1-5. The reason is that I need to count 1-5 to become 5, and not 1, as it is when it count on one row.

I've a ID, the value and where it belong.

As exempel:

ID  Value Page
1    1-5   2

The output I want is something like this:

ID Value Page
1    1    2
1    2    2
1    3    2
1    4    2
1    5    2

I've tried using a IF-statement

IF bioVerdi='1-5' THEN
        DO;
            ..
        END;

So I don't know what I should put between the DO; and END;. Any clues to help me out here?

user1770961
  • 101
  • 2
  • 2
  • 9

2 Answers2

3

You need to loop over the values inside your range and OUTPUT the values. The OUTPUT statement causes the Data Step to write a record to the output data set.

data want;
set have;
if bioVerdi = '1-5' then do;
   do value=1 to 5;
      output;
   end;
end;
DomPazz
  • 12,415
  • 17
  • 23
3

Here is another solution that is less restricted to the actual value '1-5' given in your example, but would work for any value in the format '1-6', '1-7', '1-100', etc.

*this is the data you gave ;
data have ; 
    ID = 1 ; 
    value = '1-5';
    page = 2;
run;

data want ; 
 set have ; 

 min = scan( value, 1, '-' ) ; * get the 1st word, delimited by a dash ;
 max = scan( value, 2, '-' ) ; * get the 2nd word, delimited by a dash ;

 /*loop through the values from min to max, and assign each value as the loop iterates to a new column 'NEWVALUE.' Each time the loop iterates through the next value, output a new line */
 do newvalue = min to max ;
    output ; 
 end;

 /*drop the old variable 'value' so we can rename the newvalue to it in the next step*/
 drop value min max; 

 /*newvalue was a temporary name, so renaming here to keep the original naming structure*/
 rename newvalue = value ; 

run;
Heidi
  • 86
  • 3
  • 1
    Won't the scan function return text variables instead of numeric variables? – Reeza Apr 27 '15 at 03:13
  • @Reeza - Yes, the scan function creates variables MIN and MAX which are character variables storing min = '1' and max = '2'. However, SAS automatically converts character indices that "look" like numbers to numbers in the do loop -- so `do newvalue = '1' to '5'` is handled as `do newvalue =1 to 5`, and newvalue keeps the numeric index. – Heidi Apr 27 '15 at 18:42
  • Doesn't that leave a log in the message to that effect? Generally implicit conversions are considered incorrect :) – Reeza Apr 27 '15 at 21:21
  • Yes, it does leave: "NOTE: Character values have been converted to numeric values...". If it's a bother you can easily just wrap the scan function with the input function to convert it and store it as numeric from the beginning: `min = input( scan( value, 1, '-' ), 8.) ;` This is a minor detail so I left it from the solution for clarity. – Heidi Apr 27 '15 at 22:08
  • Ah, yes. That may be a better solution, since it's scalable. The only problem is that i drops all the values that not is written in this way '1-5'. I got value that also is just '5' f.ex. – user1770961 Apr 28 '15 at 18:54
  • I also get a error message: 'ERROR: Invalid DO loop control information, either the INITIAL or TO expression is missing or the BY expression is missing, zero, or invalid.' – user1770961 Apr 28 '15 at 19:06
  • @user1770961 It sounds like you might have some nonstandard data in your dataset.. like " - 5" or " - 6" -- Do a proc freq of your data to see if they are all the same format. You should see that they are all only separated by dashes as well. It could be the case that there are some bizarre characters in there (like 5:6 instead of 5-6), in which case you should use the trnwrd function to replace those with dashes – Heidi Apr 29 '15 at 01:09