0

I have a data set with three columns: Names, ColA and ColB. There's a number of rows for each name, and for each name I need to subtract B from A, and make a new column get the remaining value of the differece between A and B:

Names  ColA    ColB   NewColA
  x      100       5      95
  x      100      20      75
  x      100      10      65 

Is this possible? I've tried using IF-Then statements, Do-While and considered a macro but my head is still stuck in Excel/VBA mode so I'm not sure how to do it?

NHansen
  • 101
  • 2
  • 8

1 Answers1

2

Your question is very similar to doing a running total with a by grouping. An example of that is found here: http://support.sas.com/kb/24/649.html

The secret is using the "first" implicit variables.

data out;
    set [replace with your input dataset]; 
    by names;
    retain newColA;
    if first.names then newcola=cola;
    newcola = newcola - colb;
run;

[edit] I forgot the retain statement. Here goes an example using the fish dataset from sashelp. (Although it doesn't make sense doing that there.)

First off, the dataset must be sorted. If yours is already you can go straight to the data step.

proc sort data = sashelp.fish(where=(Weight ne .) drop=Length1-Length3) out = fish nodupkey force;
    by species weight height;
run;

data out;
    set fish;
    by species;
    retain newColA;
    if first.species then newColA  = weight;
    newColA = newColA - height;
run;
Tacio Medeiros
  • 367
  • 2
  • 10
  • The problem with the "first" method is that it groups my data together. I need all the lines to remain in my data set, only adding the one column with the calculation. – NHansen Sep 11 '13 at 13:38
  • add `retain NewColA;` to the above. – DomPazz Sep 11 '13 at 13:42
  • @user2754574. The "first" doesn't group the data. The link I posted does that with an extra "if" checking for "last" to output. That's how he groups the data, without that all lines are outputted. – Tacio Medeiros Sep 11 '13 at 14:11
  • It keeps substracting the first obs. value from each line, instead of shifting to the 2. and 3. value. Every time it just minusses the 5, not the next value of the column – NHansen Sep 11 '13 at 14:12
  • I tested the exact code posted with your own dataset, it works just fine. Are you sure you put the right variable in the retain statement? It should be `retain newColA;` – Tacio Medeiros Sep 11 '13 at 14:37
  • I don't know why your suggestion just doesn't works as is. I've had to put in a "else if not first.name" to get the calculations working, but now it seems to work. Thanks a lot. – NHansen Sep 12 '13 at 07:38