I have a large dataset named Planes with missing values in the arrival delay column (Arr_Delay). I want to replace those missing values with the average delay on the specific route (origin - dest) for the specific carrier.
Here is a sample of the dataset:
date carrier Flight tailnum origin dest Distance Air_time Arr_Delay
01-01-2013 UA 1545 N14228 EWR IAH 1400 227 17
01-01-2013 UA 1714 N24211 LGA IAH 1416 227 .
01-01-2013 AA 1141 N619AA JFK MIA 1089 160 .
01-01-2013 EV 5708 N829AS LGA IAD 229 53 -18
01-01-2013 B6 79 N593JB JFK MCO 944 140 14
01-01-2013 AA 301 N3ALAA LGA ORD 733 138 .
01-01-2013 B6 49 N793JB JFK PBI 1028 149 .
01-01-2013 B6 71 N657JB JFK TPA 1005 158 19
01-01-2013 UA 194 N29129 JFK LAX 2475 345 23
01-01-2013 UA 1124 N53441 EWR SFO 2565 361 -29
Code I tried:
proc stdize data=cs1.Planes reponly method=mean out=cs1.Complete_data;
    var Arr_Delay; /* without a BY statement, missing values get the overall mean */
run;
But as my problem states, I want each missing value replaced by the mean for its specific route and specific carrier, not the overall mean. Please help me with this!
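
For reference, a minimal sketch of one way this might be done: sort the data by carrier and route, then run PROC STDIZE with a BY statement so that the mean is computed within each carrier/origin/dest group. This assumes that Arr_Delay is stored as a numeric variable (with "." being a true SAS missing value) and that the variable names match the sample above. The intermediate dataset name work.planes_sorted is hypothetical.

```sas
/* Sort so that BY-group processing is possible.               */
/* work.planes_sorted is a hypothetical intermediate dataset.  */
proc sort data=cs1.Planes out=work.planes_sorted;
    by carrier origin dest;
run;

/* REPONLY replaces missing values only and leaves the other   */
/* values unstandardized. With the BY statement, METHOD=MEAN   */
/* uses the mean of each carrier/origin/dest group rather than */
/* the overall mean.                                           */
proc stdize data=work.planes_sorted reponly method=mean
            out=cs1.Complete_data;
    by carrier origin dest;
    var Arr_Delay;
run;
```

A group in which every Arr_Delay value is missing has no mean to fill in, so those rows would most likely stay missing and need a fallback, such as a second pass by carrier only.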