0

I have a dataset which represents the volume of sales over three years:

data test;
input one two three average;
datalines;
10 20 30 .
20 30 40 .
10 30 50 .
10 10 10 .
;
run;

I'm looking for a way to find the middle point of the three years, the average sale point

the updated dataset would read

data test;
input one two three average;
datalines;
10 20 30 2
20 30 40 1.5
10 30 50 2.1
10 10 10 1.5
;
run;

So essentially looking for what part of the three years the halfway point of the sales occurred.

Appreciate.

EDIT: what I've been trying with the weight and proc means

I've been trying to use proc means and weight function but it doesn't give me the average point of the three years

proc means data=test noprint;
var one two three;
var one+two+three=total;
var (one+two+three)/3=Average; 
var Average/weight=Average_Year;

output out=testa2
    sum(Total) = 
    mean(Total) = ;
run;
78282219
  • 593
  • 5
  • 21
  • Can you please include what you've tried so far? This seems fairly straightforward in a data step. – Joe Mar 19 '18 at 15:35
  • I believe the second row `20 30 40 1.5` is mis-stated. The rolling sum midpoint would be at 45/50 representing index 1.833. (20/20 + 25/30) – Richard Mar 20 '18 at 12:33

1 Answers1

0

I think your second example is wrong and the correct value for average is actually 1.833 rather than 1.5. If I've got that right, the following data step code does what you need:

data want;
  set test;
  array years[3] one two three;
  total = one + two + three;
  midpoint = total / 2;
  do i = 1 by 1 until(cum_total >= midpoint);
    cum_total = sum(cum_total,years[i]);
  end;
  average = i - 1 + (midpoint - (cum_total - years[i]))/years[i];
run;

I think it would be difficult to reproduce this logic via proc means as your average doesn't directly correspond to any well-known statistic that I'm aware of. It's more like some sort of weighted median with uniform pro-rating.

user667489
  • 9,501
  • 2
  • 24
  • 35