3

I have the following reservation data:

Site Reservation Arrival Departure Category
AIRBNB_1 1234 2023-01-01 2023-01-02 GUEST
AIRBNB_1 1234 2023-01-02 2023-01-03 GUEST
AIRBNB_1 1234 2023-01-03 2023-01-07 OWNER
AIRBNB_2 5678 2023-01-15 2023-01-17 GUEST
AIRBNB_3 9123 2023-02-01 2023-02-02 COMP

I need to add a new "Umbrella Category" that takes into account the Category column across the Site + Reservation.

Site Reservation Arrival Departure Category Umbrella Category
AIRBNB_1 1234 2023-01-01 2023-01-02 GUEST OWNER
AIRBNB_1 1234 2023-01-02 2023-01-03 GUEST OWNER
AIRBNB_1 1234 2023-01-03 2023-01-07 OWNER OWNER
AIRBNB_2 5678 2023-01-15 2023-01-17 GUEST GUEST
AIRBNB_3 9123 2023-02-01 2023-02-02 COMP COMP

Unfortunately, I cannot group because I need all records. Is there a function or possible partition that would allow me to assign an umbrella category based on the category column? I have the below query, but it doesn't allow me to take into account Site + Reservation columns (which is needed because the first three lines are one reservation. Any insight is much appreciated.

select
    [Site],
    Reservation,
    Arrival,
    Departure,
    Category,
    case 
        when Category in ('GUEST', 'OWNER') then 'OWNER'
        when Category in ('OWNER', 'COMP') then 'OWNER'
        when Category in ('GUEST', 'COMP') then 'GUEST'
        else Category end as [Umbrella Category]
from #Reservations

Supafly
  • 57
  • 3
  • 1
    _"Unfortunately, I cannot group because I need all records"_ - you can, just use a CTE. – Dai Jul 11 '23 at 12:38
  • 2
    Your posted `CASE WHEN...` logic does not correspond to your expected-results: when will `COMP` be returned? Why are you checking for `'OWNER'` twice? – Dai Jul 11 '23 at 12:39
  • 1
    Please explain the structure and contents of your `#Reservations` table. Your post suggests that the `#Reservations` table is denormalized - if so, please describe the normalized source data (and if you can, or cannot, just reference that directly). Also, may I ask why you are using a `#TemporaryTable` here? – Dai Jul 11 '23 at 12:41
  • 1
    It's a large table, thus the reason I created a small temp table to demonstrate some example records/scenarios. In the case of owner, there is a priority which owner wins if that category is listed with any other category. Does that make sense? And to answer the comp question, it would only be comp if the reservation is a single line with a category = comp. – Supafly Jul 11 '23 at 12:44
  • 2
    I believe I understand your logic. Essentially you want to look at a group/partition of records that share the same `site` and `reservation` and then if that group has both `GUEST` and `OWNER` categories, then the umbrella is `OWNER`. Please confirm if that is correct. – JNevill Jul 11 '23 at 12:46
  • BAsed on your last comment, would it be safe to say that in a group/partition of records that `Owner` wins over `Guest` and `Comp`, and that `Guest` wins over `Comp` and then `Comp` only wins when it's by itself? – JNevill Jul 11 '23 at 12:47
  • I think the real problem here is that your `#Reservations` data is denormalized, which is why you're having problems trying to work with it - is there any way you can get _normalized_ data, with actual primary-keys? – Dai Jul 11 '23 at 12:48
  • My apology! When I was building the table in my original post, I forgot to update the arrival/departure dates! No wonder it was confusing. If you look at reservation 1234, it's actually a single reservation (likely because the owner had two friends who stayed one night each). Hopefully that makes sense. – Supafly Jul 11 '23 at 12:53
  • JNevill, you got it! – Supafly Jul 11 '23 at 12:57
  • It's a strange schema, but yes you can use window queries with partitioning, by the reservation value. The problem is in the ordering and the concept of "Umbrella" Why does "OWNER" win over "GUEST" and "COMP"? In this case we can use a simple sort, but ideally your `Category` should represent a lookup in another table that provides metadata or a specific value to determine the correct lookup logic. – Chris Schaller Jul 11 '23 at 12:58

4 Answers4

3

From your sample data it looks like you could simply use a windowed aggregate of categories:

select *, Max(category) over(partition by Site, Reservation) Umbrella_Category
from #Reservations;
Stu
  • 30,392
  • 6
  • 14
  • 33
2

Consider a new table to hold the ranking of your categories so you can determine which category wins in a group/partition:

category:

+----------+------+
| category | rank |
+----------+------+
| OWNER    |    1 |
| GUEST    |    2 |
| COMP     |    3 |
+----------+------+

Then you can join to this to get your umbrella rankings:

SELECT
    res2.Site,
    res2.Reservation,
    res2.Arrival,
    res2.Departure,
    res2.Category,
    cat2.Category as Umbrella
FROM 
   (
      SELECT cat.Site, 
        cat.Reservation, 
        cat.Arrival, 
        cat.Departure, 
        cat.Category, 
        MIN(res.rank) OVER (PARTITION BY Site, Reservation) as Umbrella_cat_rank
      FROM Reservations res
       INNER JOIN categories cat 
          ON res.Category = cat.category
   ) res2 
   INNER JOIN category cat2 
       ON res2.Umbrella_cat_rank = cat2.rank

While the SQL is a bit more involved with the subquery and two joins, this gives you a lot of flexibility if in the future you add more categories or need to change ranks. You merely INSERT/UPDATE into your category table as needed and this sql will bring back the correct umbrella category.

JNevill
  • 46,980
  • 4
  • 38
  • 63
1

Try the following:

SELECT
    [Site],
    Reservation,
    Arrival,
    Departure,
    Category,
    MAX(CASE
            WHEN Category IN ('GUEST', 'OWNER') THEN 'OWNER'
            WHEN Category IN ('COMP', 'OWNER') THEN 'OWNER'
            WHEN Category IN ('GUEST', 'COMP') THEN 'GUEST'
            ELSE Category
        END) OVER (PARTITION BY [Site], Reservation) AS [Umbrella Category]
FROM
    #Reservations
Amira Bedhiafi
  • 8,088
  • 6
  • 24
  • 60
  • This is not a good example of `CASE`. `WHEN`...`IN` is only good for mutually exclusive groups, in this case all `'COMP'` values will be converted to `'OWNER'` and all `'GUEST'` values will also be converted to `'OWNER'` making all records `'OWNER'`. This answer therefor confuses people about how `CASE` in a window query works. – Chris Schaller Jul 11 '23 at 13:05
1

My approach performs the GROUP BY in a CTE (grouped) and each group is examined in a second CTE (umbrellaCategories), where the "Umbrella Category" values are determined exhaustively from separate filtered COUNT expressions in a single CASE... expression. The advantage of this approach is that if the source-data contains unexpected rows (e.g. 2 or more 'COMP' rows for a single Site+Reservation) then that can be explicitly detected and handled, if necessary:

DBFiddle link.

/*
CREATE TABLE reservations (
  
  Site        nvarchar(50) NOT NULL,
  Reservation int          NOT NULL,
  Arrival     date         NOT NULL,
  Departure   date         NOT NULL,
  Category    nvarchar(50) NOT NULL
);

INSERT INTO reservations ( Site, Reservation, Arrival, Departure, Category ) VALUES
( 'AIRBNB_1', 1234, '2023-01-01', '2023-01-02', 'GUEST' ),
( 'AIRBNB_1', 1234, '2023-01-02', '2023-01-03', 'GUEST' ),
( 'AIRBNB_1', 1234, '2023-01-03', '2023-01-07', 'OWNER' ),
( 'AIRBNB_2', 5678, '2023-01-15', '2023-01-17', 'GUEST' ),
( 'AIRBNB_3', 9123, '2023-02-01', '2023-02-02', 'COMP'  );
*/

WITH grouped AS (

  SELECT
      r.Site,
      r.Reservation,
      COUNT( * ) AS CountRows_All,
      COUNT( CASE r.Category WHEN 'OWNER' THEN 1 END ) AS CountRows_Owner,
      COUNT( CASE r.Category WHEN 'GUEST' THEN 1 END ) AS CountRows_Guest,
      COUNT( CASE r.Category WHEN 'COMP'  THEN 1 END ) AS CountRows_Comp,
      COUNT( CASE WHEN r.Category <> 'OWNER' THEN 1 END ) AS CountRows_NonOwner,
      COUNT( CASE WHEN r.Category NOT IN ( 'GUEST', 'OWNER', 'COMP' ) THEN 1 END ) AS CountRows_Other
  FROM
      reservations AS r
  GROUP BY
      r.Site,
      r.Reservation
),
umbrellaCategories AS (

    SELECT
      g.Site,
      g.Reservation,
    
      CASE
          /* "owner wins if that category is listed with any other category" */
          WHEN g.CountRows_Owner > 0 AND g.CountRows_NonOwner > 0 THEN 'Owner'
          /* " it would only be comp if the reservation is a single line with a category = comp" */
          WHEN g.CountRows_Comp = 1 AND g.CountRows_All = 1 THEN 'Comp'
          WHEN g.CountRows_Guest > 0 THEN 'Guest' ELSE 'Other'
      END AS "Umbrella Category"
    FROM
      grouped AS g
)
SELECT
    r.Site,
    r.Reservation,
    r.Arrival,
    r.Departure,
    r.Category,
    uc."Umbrella Category"
FROM
    reservations AS r
    INNER JOIN umbrellaCategories AS uc ON
       r.Site = uc.Site
       AND
       r.Reservation = uc.Reservation
ORDER BY
    r.Site,
    r.Reservation;

Gives me these results:

enter image description here

Dai
  • 141,631
  • 28
  • 261
  • 374