Task: select sportsmen who participate in at least 2 competitions in a row (2 competitions go one after another; 1-2-3-4-5: 2&4 or 1&3&5 are not ok, 1&2 is ok, 1&2&3 is ok, 1&2 and 4&5 is ok). Question: find the best way (faster, less resources)
Working table:
There is single hold_date for each competition_id.
There is only one result for each sportsman_id per competition_id.
This works fine for 25 rows in result table:
SELECT DISTINCT sportsman_id, sportsman_name, rank, year_of_birth, personal_record, country
FROM
(
SELECT sportsman_id, hold_date,
LAG (comp_order, 1) OVER (PARTITION BY sportsman_id ORDER BY sportsman_id) prev_comp_number
, comp_order
FROM result
INNER JOIN
(
SELECT hold_date, ROW_NUMBER() OVER (ORDER BY hold_date) AS comp_order
FROM
(
SELECT DISTINCT hold_date
FROM result
)
) USING (hold_date)
ORDER BY sportsman_id, comp_order
)
INNER JOIN sportsman USING (sportsman_id)
WHERE comp_order-prev_comp_number=1
;
screenshot of code with comments:
sample data:
result of code above (=desired result)
Let's assume there are millions of rows (thousands of competitions and thousands of sportsmen). How reliable my code is?
What I think is decreasing the number of rows by excluding the rows if sportsman_id occurs just once (if sportsman took part (got result) at only 1 competition he obviously can't be the one). Something like this: (haven't implement tho (dunno how or most likely when/where))
SELECT re.hold_date, r.sportsman_id
FROM result r
INNER JOIN result re ON (re.sportsman_id=r.sportsman_id)
GROUP BY r.sportsman_id, re.hold_date
HAVING COUNT(r.sportsman_id) > 1
;
Then, I guess with LAG I only double existing column which is kinda fine?
Is there easier way with using PLSQL? Or there is a function which does some part of my code?