0

Between the following two queries

  SELECT 
  CASE WHEN col1=1  THEN "one"
       WHEN col1=2  THEN "two"
       ELSE "whatever" 
       END AS numb from TABLE1 WHERE col1=1 or col1=2


  SELECT 
  CASE WHEN col1=1  THEN "one"
       WHEN col1=2  THEN "two"
       ELSE "whatever" 
       END AS numb from TABLE1

Are there any significant differences between two in terms of time complexity? I know the output table for the first table will be smaller due to the condition under WHERE filter, but would it also take more time due to the WHERE filter?

I wanna know how much time WHERE filter would consume.

Any help would be greatly appreciated.

user98235
  • 830
  • 1
  • 13
  • 31
  • Depends on the contents of TABLE1. – tinazmu Mar 08 '23 at 02:57
  • @tinazmu could you elaborate? how is TABLE1 contains a lot of columns vs. TABLE1 contains col1 and just a few others different? – user98235 Mar 08 '23 at 03:25
  • It is more about the distribution of values across rows. If the table has millions of rows with only a few rows with col1=1 or col1=2 then the first version would work better (CPU effort to check for the condition pays off because you don't have to retrieve unwanted/irrelevant rows), if only a few rows have not (col1=1 or col1=2) then the other version works better (ie the effort to eliminate non-matching rows has little return). It also depends on if you have an index on Col1. – tinazmu Mar 08 '23 at 03:35
  • If there is an index on col1 then the first query will be significantly faster. If there isn’t then both would require a full table scan and would take approximately the same amount of time – NickW Mar 08 '23 at 07:51

1 Answers1

0

Are there any significant differences between two in terms of time complexity?

Both are O(n) in terms of time complexity - they perform constant amount of work per row (since Presto/Trino is not actually a database and does not maintain/support indexes).

but would it also take more time due to the WHERE filter

Yes, but I would argue that in real life scenarios it should be quite insignificant compared to the rest of the processing (if the filter is "useless" - i.e. (almost)all the data should be present in the output though it potentially can be better to skip it of course).

I wanna know how much time WHERE filter would consume.

Use EXPLAIN ANALYZE on your data and query to see the cost in your case.

Guru Stron
  • 102,774
  • 10
  • 95
  • 132
  • could you also take a look at https://stackoverflow.com/questions/75680237/creating-a-new-column-using-existing-columns-but-in-non-exclusive-conditions ? – user98235 Mar 09 '23 at 04:24
  • @user98235 done =) Also note that you have not specified SQL engine in it. – Guru Stron Mar 09 '23 at 05:20