1

Table Structure

I have a table similar to the following:

venues

The following table describes a list of businesses

id    name
50    Nando's
60    KFC

rewards

The table describes a number of rewards, the venue it corresponds to and the number of points needed to redeem the reward.

id    venue_id    name        points
1     50          5% off      10
2     50          10% off     20
3     50          11% off     30
4     50          15% off     40
5     50          20% off     50
6     50          30% off     50
7     60          30% off     70
8     60          60% off     100
9     60          65% off     120
10    60          70% off     130
11    60          80% off     140

points_data

The table describes the number of points remaining a user has for each venue.

venue_id    points_remaining
50           30
60          90

Note that this query is actually computed within SQL like so:

select * from (
  select venue_id, (total_points - points_redeemed) as points_remaining
  from (
         select venue_id, sum(total_points) as total_points, sum(points_redeemed) as points_redeemed
         from (
                (
                  select venue_id, sum(points) as total_points, 0 as points_redeemed
                  from check_ins
                  group by venue_id
                )
                UNION
                (
                  select venue_id, 0 as total_points, sum(points) as points_redeemed
                  from reward_redemptions rr
                    join rewards r on rr.reward_id = r.id
                  group by venue_id
                )
              ) a
         group by venue_id
       ) b
  GROUP BY venue_id
) points_data

but for this question you can probably just ignore that massive query and assume the table is just called points_data.

Desired Output

I want to get a single query that gets:

  • The top 2 rewards the user is eligible for each venue
  • The lowest 2 rewards the user is not yet eligible for for each venue

So for the above data, the output would be:

id    venue_id    name        points
2     50          10% off     20
3     50          11% off     30
4     50          15% off     40
5     50          20% off     50
7     60          30% off     70
8     60          60% off     100
9     60          65% off     120

What I got so far

The best solution I found so far is first getting the points_data, and then using code (i.e. PHP) to dynamically write the following:

(
  select * from rewards
  where venue_id = 50
  and points > 30
  ORDER BY points desc
  LIMIT 2
)
union all
(
  select * from rewards
  where venue_id = 50
        and points <= 30
  ORDER BY points desc
  LIMIT 2
)
UNION ALL
(
  select * from rewards
  where venue_id = 60
        and points <= 90
  ORDER BY points desc
  LIMIT 2
)
UNION ALL
(
  select * from rewards
  where venue_id = 60
        and points > 90
  ORDER BY points desc
  LIMIT 2
)
ORDER BY venue_id, points asc;

However, I feel the query can get a bit too long and in-efficient. For example, if a user has points in 400 venues, that is 800 sub-queries.

I tried also doing a join like so, but can't really get better than:

select * from points_data
INNER JOIN rewards on rewards.venue_id = points_data.venue_id
where points > points_remaining;

which is far from what I want.

halfer
  • 19,824
  • 17
  • 99
  • 186
Yahya Uddin
  • 26,997
  • 35
  • 140
  • 231
  • Why do some of your subqueries have "30" for points and others have "90"? – Gordon Linoff Sep 02 '18 at 19:10
  • It comes from the points data table – Yahya Uddin Sep 02 '18 at 19:11
  • 1
    I think the `venue_id`s in `points_data` might be wrong? They don't seem to reflect the IDs from `venues`? – sticky bit Sep 02 '18 at 19:16
  • Sorry for the confusion. Editing question – Yahya Uddin Sep 02 '18 at 19:17
  • What MySQL version are you using? – Mureinik Sep 02 '18 at 19:50
  • You need to combine greatest-n-group, lowest-n-group and ranking.. I believe all methods have alot off (duplicated) questions on stackoverflow on how to do it on MySQL versions less then 8.0 and on MySQL versions 8.0+ – Raymond Nijland Sep 02 '18 at 20:06
  • Also your expected results does not match the requirements it seams.. "The top 2 rewards" and "The lowest 2 rewards".. i assume that would generate 8 records instead of 7 records from the current data source. – Raymond Nijland Sep 02 '18 at 20:11
  • 1
    @RaymondNijland: There is only one reward that can be redeemed with 90 points for venue 60 and that is reward 7. That's why the top 2 are the top 1 here. – sticky bit Sep 02 '18 at 20:15
  • "There is only one reward that can be redeemed with 90 points for venue 60 and that is reward 7. That's why the top 2 are the top 1 here" O now i see the required long scrolling because off the long query didn't help out to better understand the question by the way if you know what i mean.. You need a constant large of a scroll to check the data source with the expected output and whats going on there – Raymond Nijland Sep 02 '18 at 20:25
  • @Mureinik I'm using MySQL 5.7 – Yahya Uddin Sep 02 '18 at 21:12

1 Answers1

2

Correlated subqueries counting the number of higher or lower rewards to determine the top or bottom entries are one way.

SELECT r1.*
       FROM rewards r1
            INNER JOIN points_data pd1
                       ON pd1.venue_id = r1.venue_id
       WHERE r1.points <= pd1.points_remaining
             AND (SELECT count(*)
                         FROM rewards r2
                         WHERE r2.venue_id = r1.venue_id
                               AND r2.points <= pd1.points_remaining
                               AND (r2.points > r1.points
                                     OR r2.points = r1.points
                                        AND r2.id > r1.id)) < 2
              OR r1.points > pd1.points_remaining
                 AND (SELECT count(*)
                             FROM rewards r2
                             WHERE r2.venue_id = r1.venue_id
                                   AND r2.points > pd1.points_remaining
                                   AND (r2.points < r1.points
                                         OR r2.points = r1.points
                                            AND r2.id < r1.id)) < 2
       ORDER BY r1.venue_id,
                r1.points;

SQL Fiddle

Since MySQL 8.0 a solution using the row_number() window function would be an alternative. But I suppose you are on a lower version.

SELECT x.id,
       x.venue_id,
       x.name,
       x.points
       FROM (SELECT r.id,
                    r.venue_id,
                    r.name,
                    r.points,
                    pd.points_remaining,
                    row_number() OVER (PARTITION BY r.venue_id,
                                                    r.points <= pd.points_remaining
                                       ORDER BY r.points DESC) rntop,
                    row_number() OVER (PARTITION BY r.venue_id,
                                                    r.points > pd.points_remaining
                                       ORDER BY r.points ASC) rnbottom
                    FROM rewards r
                         INNER JOIN points_data pd
                                    ON pd.venue_id = r.venue_id) x
       WHERE x.points <= x.points_remaining
             AND x.rntop <= 2
              OR x.points > x.points_remaining
                 AND x.rnbottom <= 2
       ORDER BY x.venue_id,
                x.points;

db<>fiddle

The tricky part is here to partition the set also into the subset where the points of the user are enough to redeem the reward and the one where the points aren't enough, per venue. But as in MySQL logical expressions evaluate to 0 or 1 (in non Boolean context), the respective expressions can be used for that.

sticky bit
  • 36,626
  • 12
  • 31
  • 42
  • This works quite well. Although I am using MySQL 7, could you also show me the `row_number()` example anyway, so that when I upgrade. – Yahya Uddin Sep 02 '18 at 21:03
  • @YahyaUddin: I thought there were no version 7.x. There are some experimental versions 6.x. But the latest two stable versions are 5.7 and 8.0 AFAIK. But that aside, I edited the answer to also include the `row_number()` alternative. – sticky bit Sep 02 '18 at 21:39