2

Cross apply is very slow when inserting large data sets, I think unpivot (or something else) might be more efficient but I can't figure out how to do it other than by using cross apply:

CREATE TABLE LocData
    (Airport varchar(5), City varchar(5), Country varchar(5))
;

INSERT INTO LocData
    (Airport, City, Country)
VALUES
    ('LHR','LON','GB'),
    ('LGW','LON','GB'),
    ('LCY','LON','US'),
    ('JFK','NYC','US'),
    ('LGA','NYC','US'),
    ('EWR','NYC','US')
;

CREATE TABLE Travel
    (ID int, Dep varchar(5), Arr varchar(5))
;

INSERT INTO Travel
    (ID, Dep, Arr)
VALUES
    (1, 'LHR','JFK'),
    (2, 'LHR','EWR'),
    (3, 'LGA','LGW'),
    (4, 'LCY','EWR')
;

--the query

select * from Travel;

select 
    c.*
    from Travel t
    join LocData dep on dep.Airport=t.Dep
    join LocData arr on arr.Airport=t.Arr
    cross apply
    (
    values 
    (t.ID, dep.Airport, 0, arr.Airport, 0),
    (t.ID, dep.City, 1, arr.City, 1),
    (t.ID, dep.Country, 2, arr.Country, 2)
    ) c (ID, Dep, DepType, Arr, ArrType);

You can test it here (SQLFiddle)

Shnugo
  • 66,100
  • 9
  • 53
  • 114
baileyswalk
  • 1,198
  • 2
  • 17
  • 29
  • 1
    I just wanted to write something rude, because your question looked so extremely poor, but then I saw the fiddle link... I entered the code into the question, which will - for sure - bring you much better answers... – Shnugo Aug 19 '17 at 10:01
  • 1
    Are there any indexes / foreign keys? Are you really using the *3-char* airport code as the FK? What do you mean with *is very slow* and what are *large data sets*? – Shnugo Aug 19 '17 at 10:05
  • Thanks @Shnugo wasn't sure if I should provide both, will do in the future. – baileyswalk Aug 19 '17 at 21:45

1 Answers1

2

Your query is:

select c.*
from Travel t join
     LocData dep
     on dep.Airport = t.Dep join
     LocData arr
     on arr.Airport = t.Arr cross apply
     (values (t.ID, dep.Airport, 0, arr.Airport, 0),
             (t.ID, dep.City, 1, arr.City, 1),
             (t.ID, dep.Country, 2, arr.Country, 2)
    ) c(ID, Dep, DepType, Arr, ArrType);

This seems like a very efficient way to do what you want to do. It should have good performance.

One thing that will improve performance is an index on Locadata(airport). You can also include the columns city and country (either include them or make them additional keys). I would define the table using it as a primary key (which automatically creates the best type of index):

CREATE TABLE LocData (
    Airport varchar(5) PRIMARY KEY,
    City varchar(5),
    Country varchar(5)
);

(Making the reasonable assumption that it is never NULL and never duplicated.)

In other words, I sincerely doubt that the apply affects performance. The joins are much more likely to be the culprit.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • You are right, the issue was nothing to do with the cross apply, it was down to a further operation outwith what I posted here. Now inserting 3mil+ in sub 40secs (was 19 minutes before). Happy days, false alarm. – baileyswalk Aug 19 '17 at 21:53