Select unique combination of values (attributes) based on user_id

Question

I have a table that has user a user_id and a new record for each return reason for that user. As show here:

| user_id   | return_reason |
|---------  |-------------- |
| 1         | broken        |
| 2         | changed mind  |
| 2         | overpriced    |
| 3         | changed mind  |
| 4         | changed mind  |

What I would like to do is generate a foreign key for each combination of values that are applicable in a new table and apply that key to the user_id in a new table. Effectively creating a many to many relationship. The result would look like so:

Dimension Table ->

| reason_id     | return_reason     |
|-----------    |---------------    |
| 1             | broken            |
| 2             | changed mind      |
| 2             | overpriced        |
| 3             | changed mind      |

Fact Table ->

| user_id   | reason_id     |
|---------  |-----------    |
| 1         | 1             |
| 2         | 2             |
| 3         | 3             |
| 4         | 3             |

My thought process is to iterate through the table with a cursor, but this seems like a standard problem and therefore has a more efficient way of doing this. Is there a specific name for this type of problem? I also thought about pivoting and unpivoting. But that didn't seem too clean either. Any help or reference to articles in how to process this is appreciated.

Why is there one row less in the resultset that in the original data? — GMB, Sep 26 '20 at 17:59
The result set would have a mapping of attributes to users. Therefore the table should have one record per user. The `reason_id` is a foreign key in that table. In another table I will have a dimension table that expands on each foreign key. So for example In the dimension table, `reason_id` = 2 , would have 2 rows one for 'changed mind' and one for 'overpriced'. But they would share the same `reason_id`. Thus capturing an instance of these two reasons together — Anthony, Sep 26 '20 at 18:04
You desired results shows just one table. If you want to generate two tables (or more?), then please show us these tables too. — GMB, Sep 26 '20 at 18:07
Huh? You have repeated ids in the `reason` table. I just don't follow the logic. And why doesn't the "fact" table have five rows like the original data? — Gordon Linoff, Sep 27 '20 at 00:58

June7 · Answer 1 · 2020-09-27T17:15:20.770

The problem concerns data normalization and relational integrity. Your concept doesn't really make sense - Dimension table shows two different reasons with same ID and Fact table loses a record. Conventional schema for this many-to-many relationship would be three tables like:

Users table (info about users and UserID is unique)
Reasons table (info about reasons and ReasonID is unique)
UserReasons junction table (associates users with reasons - your existing table). Assuming user could associate with same reason multiple times, probably also need ReturnDate and OrderID_FK fields in UserReasons.

So, need to replace reason description in first table (UserReasons) with a ReasonID. Add a number long integer field ReasonID_FK in that table to hold ReasonID key.

To build Reasons table based on current data, use DISTINCT:
SELECT DISTINCT return_reason INTO Reasons FROM UserReasons

In new table, rename return_reason field to ReasonDescription and add an autonumber field ReasonID.

Now run UPDATE action to populate ReasonID_FK field in UserReasons.
UPDATE UserReasons INNER JOIN UserReasons.return_reason ON Reasons.ReasonDescription SET UserReasons.ReasonID_FK = Reasons.ReasonID

When all looks good, delete return_reason field.

Select unique combination of values (attributes) based on user_id

1 Answers1