1

I am junior web developer. I am studying about postgres, and I have a question about use case of case statement.

This query finds several rows with data that most closely matches a particular word.

The result can be multiple.

In this query, the ranking value is calculated using a case statement according to the similarity for each row.

It is understood that the case statement returns a value that matches each execution.

When the case statement is used inside the min and window functions, I wonder how to find the result.

the given table PostalCode is..

|---------------------|------------------|
|       pcode         |       name       |
|---------------------|------------------|
|       100456        |        a         |
|---------------------|------------------|
|       111343        |        b         |
|---------------------|------------------|
|       101334        |        c         |
|---------------------|------------------|
|       100567        |        d         |
|---------------------|------------------|
|       102234        |        e         |
|---------------------|------------------|

The query is..

SELECT pcode,
       name
  FROM (SELECT pcode,
               name,
                 CASE WHEN pcode = '100123' THEN 0
                      WHEN pcode = '10012%' THEN 1
                      WHEN pcode = '1001%'  THEN 2
                      WHEN pcode = '100%'   THEN 3
                      WHEN pcode = '10%'    THEN 4
                      WHEN pcode = '1%'     THEN 5
                      ELSE NULL END AS hit_code,    
                 MIN(CASE WHEN pcode = '100123' THEN 0 
                          WHEN pcode = '10012%' THEN 1
                          WHEN pcode = '1001%'  THEN 2
                          WHEN pcode = '100%'   THEN 3
                          WHEN pcode = '10%'    THEN 4
                          WHEN pcode = '1%'     THEN 5
                          ELSE NULL END)
                 OVER(ORDER BY CASE WHEN pcode = '100123' THEN 0
                                    WHEN pcode = '10012%' THEN 1
                                    WHEN pcode = '1001%'  THEN 2
                                    WHEN pcode = '100%'   THEN 3
                                    WHEN pcode = '10%'    THEN 4
                                    WHEN pcode = '1%'     THEN 5
                                    ELSE NULL END) AS min_code,
         FROM PostalCODE) Foo
WHERE hit_code = min_code;  

and the result will be

|---------------------|------------------|
|       pcode         |       name       |
|---------------------|------------------|
|       100456        |        a         |
|---------------------|------------------|
|       100567        |        d         |
|---------------------|------------------|

I know that the first case statement returns different values ​​depending on the similarity.

After case end, the MIN function is called, but what will it return?

I can not understand this context.

Please help me.

If you help me, I can't give you anything, but I would like to express my sincere gratitude.

Thanks for reading.

kk jj
  • 13
  • 4

2 Answers2

0

Your case expression checks the similarity between a given p_code and a fixed code, and returns hit_code, a value between 0 (highest similarity) and 5 (lowest similarity), or null (no similarity at all).

The window function min() over() does the very same operation over the whole dataset: it gives you the minimum hit_code for the whole table.

Finally, the outer query filters on the record(s) that have the minimum hit_code (ie the highest similarity).

Potential problems with the code:

  • the window function should have an empty over() clause (that order by does not help)

  • pattern matching requires operator like instead of =

Note that the code could be shorten by using a ranking function directly:

select pcode, name
from (
    select
        t.*,
        rank() over(
            order by case
                when pcode = '100123' then 0
                when pcode like '10012%' then 1
                when pcode like '1001%'  then 2
                when pcode like '100%'   then 3
                when pcode like '10%'    then 4
                when pcode like '1%'     then 5
                else null 
            end)
        as hit_rank
    from mytable t
) t
where hit_rank = 1
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Thank you for the reply. It helped a lot! Does this 'hit_rank = 1' statement mean that only a value with a hit_rank value of 1 will eventually be returned? – kk jj Mar 04 '20 at 19:16
  • @kkjj: just like the original query, this gives you the record that has the best similarity (ties included). I think that’s the simplest way to do what you want. You may try it against your real data. – GMB Mar 04 '20 at 19:27
0

If you want only one row, this would be so much simpler as:

SELECT pcode, name
FROM PostalCODE pc CROSS JOIN LATERAL
     (CASE WHEN pcode = '100123' THEN 0
           WHEN pcode = '10012%' THEN 1
           WHEN pcode = '1001%'  THEN 2
           WHEN pcode = '100%'   THEN 3
           WHEN pcode = '10%'    THEN 4
           WHEN pcode = '1%'     THEN 5
      END) v(hit_code)
ORDER BY hit_code ASC 
LIMIT 1;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786