1

I am working to shift calculations being performed in Tableau to be performed in an underlying SAS query and I am having a hard time recreating the Tableau Level of Detail function FIXED. Here's an example: { FIXED [ID_Field], [Group]: MAX([Value_Field]) } The tableau calculation is saying Find the MAX of the Value_Field for each individual's ID_Field in each Group

I tried to replicate this using a subquery so my query looked and operated something like this

Proc SQL; Create table XYZ.Tableau_Calcs AS

SELECT 
Value Field
,ID_Field
,(Select MAX(Value_Field) 
 From XYX.Original_Calcs 
 group by ID_Field, Group) as SAS_Calc
,Group
,Date
,Flag
From XYX.Original_Calcs

However, I got an error message "ERROR: Subquery evaluated to more than one row." Apparently that means it's producing more than one line

Does anyone know how to fix this error? Here's some data that has the same structure as the data I'm dealing with at work.

Value_Field ID_Field Group Date Flag
56 25193T4010 H5 10/31/2018 1
56 25193T4010 H5 1/28/2019 1
38 25193T4010 N9 2/9/2019 0
38 73437H0904 E3 7/6/2017 1
38 73437H0904 E3 3/14/2017 1
48 73437H0904 H5 8/14/2018 1
48 73437H0904 H5 10/15/2018 1
66 91641V2912 H5 10/1/2018 0
66 91641V2912 H5 11/1/2019 1
98 91641V2912 N9 3/1/2019 1
98 91641V2912 N9 3/1/2019 1
Joe
  • 62,789
  • 6
  • 49
  • 67
D_man
  • 113
  • 3
  • 9
  • What is it that you want that is different than the input? Perhaps you need to post better sample data where MAX(VALUE_FIELD) is going to be different than VALUE_FIELD on at least on observation. – Tom Aug 17 '21 at 22:30
  • Your SQL sub query groups by ID_FIELD and GROUP but the Tableau only indicates ID_FIELD so which is correct? – Reeza Aug 17 '21 at 23:19

2 Answers2

3

PROC SQL will let you calculate an aggregate function and keep the detailed observations. It will and automatically re-merge the summary values onto all of the detailed observations for that group.

data have;
  input value_field id_field :$10. group $ date :mmddyy. flag;
  format date yymmdd10.;
cards;
56 25193T4010 H5 10/31/2018 1
56 25193T4010 H5 1/28/2019 1
38 25193T4010 N9 2/9/2019 0
38 73437H0904 E3 7/6/2017 1
38 73437H0904 E3 3/14/2017 1
48 73437H0904 H5 8/14/2018 1
48 73437H0904 H5 10/15/2018 1
66 91641V2912 H5 10/1/2018 0
66 91641V2912 H5 11/1/2019 1
98 91641V2912 N9 3/1/2019 1
98 91641V2912 N9 3/1/2019 1
;

proc sql;
create table want as
  select
      id_field
    , group
    , max(value_field) as sas_calc
    , value_field
    , date
    , flag
  from have
  group by id_field, group
;
quit;

Note for your sample data the new SAS_CALC variable is exactly the same as the current VALUE_FIELD on every observation.

Tom
  • 47,574
  • 2
  • 16
  • 29
  • I don't think this should include `group` in the group by to match the Tableau function. It's only grouped by ID_FIELD. – Reeza Aug 17 '21 at 23:18
1

I think you can modify your query as follows in SAS (or other DB) to get it working but @Tom's answer is the better SAS answer for you.

You do need to add alias and you need to have the subquery run for each ID and group so you need to include that via a WHERE clause so it only returns the correct values for each row, not for the whole data set. Adding the WHERE limits the results to the line of interest.

Proc SQL; 
Create table XYZ.Tableau_Calcs AS
SELECT 
Value_Field
,ID_Field
,(Select MAX(Value_Field) 
 From XYX.Original_Calcs as T1_sub
 where t1_sub.ID_FIELD=t1.ID_FIELD and t1_sub.group=t1.group
 group by ID_Field, Group) as SAS_Calc
,Group
,Date
,Flag
From XYX.Original_Calcs as T1;
quit;

There are at least two other ways to do this in SAS - one is to create the summary statistics separately and merge them in. And a second is a DoW loop if you need efficiency.

Reeza
  • 20,510
  • 4
  • 21
  • 38