-1

I would like to know how to access date column from a table and use it as date filter for another large volume table..

I have the following query that currently uses sysdate and it gets completed in twenty minutes.

select a,b,datec, sum(c) from table1 
where datec = sysdate - 1 group by a,b,datec

I'm trying to replace sysdate with a date from another table called table2. It's a small table with 1600 rows that just returns latest cycle completion date (one value) which is typically sysdate - 1 for most days except for holidays.table1 has millions of rows.

I tried the following query in order to get the date value in the filter condition:

select a,b,datec, sum(c) from table1 t1, table2 t2 where t1.datec = t2.pdate and t2.prcnm = 'TC' group by a,b,datec

select a,b,datec, sum(c) from table1 t1 inner join table2 t2 on datec = t2.pdate and t2.prcnm = 'TC' group by a,b,datec

select a,b,datec, sum(c) from table1 t1 where t1.datec = (SELECT t2.date FROM table2 t2 WHERE prcnm = 'TC') group by a,b,datec

The above queries take too long and eventually fail with this error message - "parallel query server died unexpectedly"

I would like to know how to get the date from table2 and use it in table1 without running in to performance issues. I am not able to view the plan because table1 is a view and I don't have access to underlying objects.

thanks.

CuriP
  • 83
  • 10
  • In the second query, why do you have an **outer** join? That doesn't correspond to the other queries (your original one, or the two alternatives). Other than that, I suspect that the optimizer is trying to do fancy things, when what you want is for it to compute the date from the subquery in the last version of your code, and then to use that date exactly as it is using it in your current query (based on sysdate). But, one question first: is it ever possible that the subquery (from `t2`) will ever return anything other than **exactly** one value? –  Jan 07 '21 at 18:57
  • I should have added - one thing you can try is the `/*+ no_unnest */` hint in the subquery in the `where` clause (to prevent the optimizer from doing the "fancy things" I mentioned in my earlier comment). –  Jan 07 '21 at 19:08
  • it won't return more than one value and it is not returning more than one value too. I also double checked that it is indeed returning only one value by directly by querying t2 with that condition.. sorry, it shouldn't have left join.. let me correct it. I presume it may not help even if I make it as inner join. I am not even able to return even 10 rows with any of those options. Let me try no_unnest option and let you know. thanks for your inputs. – CuriP Jan 07 '21 at 22:50
  • i tried this - select a,b,datec, sum(c) from table1 t1 where t1.datec = (SELECT /*+ no_unnest qb_name(subq1) */ t2.date FROM table2 t2 WHERE prcnm = 'TC') group by a,b,datec - it didn't make any difference.. they query keeps running on. should I be using it differently or is there any other hint that could help on this? it is quite strange that not even ten rows are returned from the query.. is it possible that it is a bug in Oracle or something like that? – CuriP Jan 08 '21 at 00:19
  • When you put the hint right in the subquery that should not be unnested, you don't need a "query block name". Adding the `qb_name(subq1)` hint (the way you wrote it, it is a second hint, separate from `no_unnest`, and it's OK as it is) was not necessary, but it shouldn't hurt anything either. So - even with the hint the query is taking long... then it must be some odd coincidence that the first version (with `sysdate`) works much faster. What happens if in the `with` clause you use a hard-coded date, instead of the `select` subquery? Such as, `where t1.datec = date '2021-01-03'` –  Jan 08 '21 at 01:53
  • 1
    It's unreasonable for your organization to expect you to tune a query without access to the explain or execution plans. It's like saying, "Make this program run fast without knowing the algorithms or data structures." We could make wild guesses, like try adding `/*+ no_parallel */` or `where rownum = 1`, but without the plans we don't know what's going on and why it worked. Although you've clearly put effort into this question, it won't be useful for anyone else. – Jon Heller Jan 08 '21 at 02:21
  • @mathguy - explicitly hardcoding the date like you mentioned works without any issues. the query gets completed and rows return. – CuriP Jan 08 '21 at 16:05

1 Answers1

0

To me, your 2nd query:

select a.a, a.b, a.datec, sum(a.c)
from table1 a join table2 b on a.datec = b.pdate
where b.prcnm = 'TC'
group by a.a, a.b, a.datec;

with indexes on table1.datec, table2.pdate and table2.prcnm might be a starting point.

Review explain plan, see what it says. Gather statistics on both tables and all indexes.

Littlefoot
  • 131,892
  • 15
  • 35
  • 57
  • that query you mentioned returns the error -"parallel query server died unexpectedly". we are running the query in external system, so no scope of adding any indexes. I am unable to look up the query plan because we don't have access to underlying objects of the view table1. the thing that's not clear is why it runs without any issues with sysdate as filter condition, but not with the same date from a small table and give parallel query error. – CuriP Jan 07 '21 at 19:33