0

I am new to SAS but familiar with R, MATLAB and Stata.

One thing I haven't been able to find yet while researching is the ability to work with individual observations throughout the data set. Assume this dataset is what I'm working with each step of the way:

Stock     Volume1 Volume2 Volume3
Apple     200     100     101
Amazon    150     1000    1020
Facebook  135     80      85
Google    80      75      80

I'll give some examples of what I'm looking to do.

  1. Taking the mean of segment 2 and 3's volume, such that I output a table:

    Volume (Avg)
    142.5
    
    1. Moreover, I'd be looking to take the average of a few columns within those segments e.g.

      Stock     Volume1 Volume2 Volume3 Volume Average
      Apple     200     100     101     133.67
      Amazon    150     1000    1020    723.33
      
                                        428.50
      
      Facebook  135     80      85      100
      Google    80      75      80      78.33
      
                                        89.165
      
                                                        258.8325
      

Generally, looking for the syntax that'll allow me to work within the dataset for operators like mean, sum, create tables and any kind of data manipulation.

78282219
  • 593
  • 5
  • 21
  • I've just found the SQL function group by within aggregate functions, I think this could be a good starting point. – 78282219 Feb 23 '18 at 07:44

2 Answers2

0

Your question is not posed very well, so I'll try to open up some ideas and then we can discuss.

You could add a row identifier to the dataset:

data yourdata2;
set yourOriginalData;
rownum = _N_;
run;

then you can use any of the various ways to compute means:

proc sql;
create table your_summarydata as
select avg(volume) as vol_mean
from yourdata2
where rownum in (2,3);
quit;

I was expanding on your original idea. I wouldn't recommend doing any serious calculations using this method. In real world I would make up some classifying variable for which I'd like the means to be calculated, and then use something like:

proc sql;
create table your_summarydata as
select groupvar, avg(volume) as vol_mean
from yourdata2
where 1=1 /* where conditions go here if you want to restrict the input dataset before calculations */
group by groupvar
quit;
jmr
  • 29
  • 7
  • I apologise, I was just editing the table to get my ideas across much better. – 78282219 Feb 23 '18 at 07:54
  • Row num and Col num seem to be what i'm looking for. In general, I'm looking for the syntax which allows me to pinpoint what cluster of observations I'm working with e.g. if I want to work with rows 2-4 and columns 3-8 only and find the mean of these observations. Appreciating your help. – 78282219 Feb 23 '18 at 07:56
  • Ok. Afaik, SAS doesn't work like that. SAS uses data much like SQL-databases: you have rows (which you can point to using a rownum) and you have columns which you can access using column names, not column numbers. That being said, there is a way to make multidimensional arrays in data-step, but I'm not an expert on the subject. Would have to look up the documentation later if I have time. – jmr Feb 23 '18 at 08:55
  • I took a look at your new tables.. Is the average of volumes you would like to achieve the 428.5, 89.165 and 258.8325 in that table? If so, I'd point you to my earlier answer; make a classifying variable which you can aggregate over with group by or some other SAS proc. And also, there is a SAS IML -module which allows you to write regular marix algebra operations, but I think it's not part of any usual SAS setup. – jmr Feb 23 '18 at 09:04
0

Your question is very broad, but here are a couple of basic related examples for you:

data have;
input Stock $ Volume1 Volume2 Volume3;
cards;
Apple     200     100     101
Amazon    150     1000    1020
Facebook  135     80      85
Google    80      75      80
;
run;

data row_means;
  set have;
  volume_mean = mean(of volume1-volume3);
run;

proc summary data = have;
  var volume1-volume3;
  output out = column_means mean=;
run;

For more highly customisable output, it sounds as though you might also be interested in proc report or proc tabulate, but a comprehensive guide to using those is beyond the scope of this site.

user667489
  • 9,501
  • 2
  • 24
  • 35