0

Hi I want to know how can do a pivot table with crosstable in a table like:

user_id action time
1 a 2022-01-01 12:30
1 b 2022-01-01 12:40
1 b 2022-01-01 12:50
1 c 2022-01-01 13:00
1 c 2022-01-01 13:10
1 c 2022-01-01 13:20
2 b 2022-01-01 13:11
2 c 2022-01-01 13:21

The expected result should look like this:

user_id a b c
1 2022-01-01 12:30 2022-01-01 12:40 2022-01-01 13:00
1 NULL 2022-01-01 12:50 2022-01-01 13:10
1 NULL NULL 2022-01-01 13:20
2 NULL 2022-01-01 13:11 2022-01-01 13:21

Following several tutorials, they gave me this result:

user_id a b c
1 2022-01-01 12:30 2022-01-01 12:40 2022-01-01 13:00
2 NULL 2022-01-01 13:11 2022-01-01 13:21

However, as you can note, there is plenty of data loss by the single line result. Does any have an idea, please? Thanks in advance

1 Answers1

0

Here is the SQL:

with action_a
as
(
 SELECT user_id,
       Row_number()
         OVER (
           ORDER BY a) id,
       a
FROM   (SELECT user_id,
               CASE
                 WHEN action = 'a' THEN time
               END "a"
        FROM   tabaction) x 
),
action_b
as
(
  SELECT user_id,
       Row_number()
         OVER (
           ORDER BY b) id,
       b
FROM   (SELECT user_id,
               CASE
                 WHEN action = 'b' THEN time
               END "b"
        FROM   tabaction) x     
),
action_c
as
(
SELECT user_id,
       Row_number()
         OVER (
           ORDER BY c) id,
       c
FROM   (SELECT user_id,
               CASE
                 WHEN action = 'c' THEN time
               END "c"
        FROM   tabaction) x 

)
SELECT action_a.user_id,
       action_a.a,
       action_b.b,
       action_c.c
FROM   action_a
       FULL OUTER JOIN action_b
               ON action_a.id = action_b.id
                  AND action_a.user_id = action_b.user_id
       FULL OUTER JOIN action_c
               ON action_b.id = action_c.id
                  AND action_b.user_id = action_c.user_id; 

Output:

 user_id |          a          |          b          |          c
---------+---------------------+---------------------+---------------------
       1 | 2022-01-01 12:30:00 | 2022-01-01 12:30:00 | 2022-01-01 12:00:00
       1 |                     | 2022-01-01 12:50:00 | 2022-01-01 13:10:00
       1 |                     |                     | 2022-01-01 13:20:00
       1 |                     |                     |
       1 |                     |                     |
       1 |                     |                     |
(6 rows)

Setup:

create table tabaction
( user_id int, action varchar(5) , time timestamp);

insert into tabaction values(1,'a','2022-01-01 12:30'),(1,'b','2022-01-01 12:30'),(1,'b','2022-01-01 12:50'),(1,'c','2022-01-01 12:00'),(1,'c','2022-01-01 13:10'),(1,'c','2022-01-01 13:20');

postgres=# select * from tabaction;
 user_id | action |        time
---------+--------+---------------------
       1 | a      | 2022-01-01 12:30:00
       1 | b      | 2022-01-01 12:50:00
       1 | c      | 2022-01-01 12:00:00
       1 | c      | 2022-01-01 13:10:00
       1 | c      | 2022-01-01 13:20:00
       1 | b      | 2022-01-01 12:30:00
(6 rows)

Ideal SQL should include user_id for action A, B, C so that those not missed out . Here the situation does not arise because USER_ID is only 1 and all actions are mapped to USER_ID. I believe might get different, in which case you might need this SQL:

with action_a
as
(
 SELECT user_id,
       Row_number()
         OVER (
           ORDER BY a) id,
       a
FROM   (SELECT user_id,
               CASE
                 WHEN action = 'a' THEN time
               END "a"
        FROM   tabaction) x 
),
action_b
as
(
  SELECT user_id,
       Row_number()
         OVER (
           ORDER BY b) id,
       b
FROM   (SELECT user_id,
               CASE
                 WHEN action = 'b' THEN time
               END "b"
        FROM   tabaction) x   
),
action_c
as
(
SELECT user_id,
       Row_number()
         OVER (
           ORDER BY c) id,
       c
FROM   (SELECT user_id,
               CASE
                 WHEN action = 'c' THEN time
               END "c"
        FROM   tabaction) x 

)
SELECT action_a.user_id a_userid,
       action_b.user_id b_userid,
       action_c.user_id c_userid,
       action_a.a,
       action_b.b,
       action_c.c
FROM   action_a
       FULL OUTER JOIN action_b
               ON action_a.id = action_b.id
                  AND action_a.user_id = action_b.user_id
       FULL OUTER JOIN action_c
               ON action_b.id = action_c.id
                  AND action_b.user_id = action_c.user_id; 
rajorshi
  • 697
  • 4
  • 9
  • Thanks for the answer, however in the case that I am evaluating there are almost 12 user ids with 10 different actions. What makes this case bit difficult is the thing that non all the users performs the same actions. That is why I came across crosstable and I tried a different bunch of possibilities without success. I will try the SQL approach suggested and will come back with further comments. Regards – Alí Meres Vargas Feb 08 '22 at 15:19