1

I have a sample dataset named Flights.I want to Extract from it the Origin Airport name which has least number of departure delays.

Sample Flights data:-

Date     (Sched_dep_time) (dep_time)(flight)(origin)  (Dep_delay_min)
01-01-2013  5:15             5:17   1545    EWR       -2
01-01-2013  5:29             5:33   1714    LGA       -4
01-01-2013  5:40             5:42   1141    JFK       -2
01-01-2013  21:10           21:04   725     JFK        6
01-01-2013  20:30           21:04   461     LGA      -74
01-01-2013  21:06           21:05   1696    EWR        1
01-01-2013  20:55           21:10   507     EWR      -55
01-01-2013  20:25           21:14   5708    LGA      -89
01-01-2013  21:10           21:15   79      JFK       -5
01-01-2013  21:24           21:16   301     LGA        8
01-01-2013  6:00             5:58   49      JFK        42
01-01-2013  6:00             5:58   71      JFK        42
01-01-2013  6:00             5:58   194     JFK        42

Code i Tried: -

Proc sql;                                                                                                                               

Create table least_delay as                                                                                                         

Select origin,min(number_of_delays)as min_delay from                                                                                                

(Select Origin,Count(Departure_delay_minutes) as Number_of_delays from 
Flight                                                       
Where (Departure_delay_minutes>0))                                                                                                      

Group by Origin                                                                                                                         
;                                                                                                                                       
Quit;    

The output i get is following: -

    Origin  min_delay
1   NLI      1135504
2   JFK      1135504
3   LGA      1135504

It shows same result for all the origin!

Can anybody help me on this?

Aman kashyap
  • 143
  • 1
  • 3
  • 12

1 Answers1

2

The specific problem in your code is that you need to add a group by Origin clause in the sub query. However, all this would do is return the number of delays for each Origin, not the Origin(s) with the least delay. A small change to the code, adding a having clause, fixes this.

data flight;
input Date :ddmmyy10. (Sched_dep_time dep_time) (:time.) flight origin $ Dep_delay_min;
format date date9. Sched_dep_time dep_time time. ;
datalines;
01-01-2013  5:15             5:17   1545    EWR       -2
01-01-2013  5:29             5:33   1714    LGA       -4
01-01-2013  5:40             5:42   1141    JFK       -2
01-01-2013  21:10           21:04   725     JFK        6
01-01-2013  20:30           21:04   461     LGA      -74
01-01-2013  21:06           21:05   1696    EWR        1
01-01-2013  20:55           21:10   507     EWR      -55
01-01-2013  20:25           21:14   5708    LGA      -89
01-01-2013  21:10           21:15   79      JFK       -5
01-01-2013  21:24           21:16   301     LGA        8
01-01-2013  6:00             5:58   49      JFK        42
01-01-2013  6:00             5:58   71      JFK        42
01-01-2013  6:00             5:58   194     JFK        42
;
run;

proc sql;
create table least_delay
as select *
from (
  select
    origin,
    count(0) as num_delays
  from
    flight
  where
    dep_delay_min>0
  group by
    origin
    )
having num_delays = min(num_delays);
quit;
Longfish
  • 7,582
  • 13
  • 19