Is there some equivalent to subquery correlation when making a derived table?

Question

I need to flatten out 2 rows in a vertical table (and then join to a third table) I generally do this by making a derived table for each field I need. There's only two fields, I figure this isn't that unreasonable.

But I know that the rows I want back in the derived table, are the subset that's in my join with my third table. So I'm trying to figure out the best derived tables to make so that the query runs most efficiently.

I figure the more restrictive I make the derived table's where clause, the smaller the derived table will be, the better response I'll get.

Really what I want is to correlate the where clause of the derived table with the join with the 3rd table, but you can't do that in sql, which is too bad. But I'm no sql master, maybe there's some trick I don't know about.

The other option is just to make the derived table(s) with no where clause and it just ends up joining the entire table twice (once for each field), and when I do my join against them the join filters every thing out.

So really what I'm asking I guess is what's the best way to make a derived table where I know pretty much specifically what rows I want, but sql won't let me get at them.

An example:

table1
------

id  tag     value
--  -----   -----
1   first   john
1   last    smith
2   first   sally
2   last    smithers


table2
------

id     occupation
--     ----------
1      carpenter
2      homemaker


select table2.occupation, firsttable.first, lasttable.last from
table2, (select value as first from table1 where tag = 'first') firsttable,
(select value as last from table1 where tag = 'last') lasttable
where table2.id = firsttable.id and table2.id = lasttable.id

What I want to do is make the firsttable where clause where tag='first' and id = table2.id

can you give some sample tables and results that can provide a good visual? — Derek, Jul 23 '12 at 12:46

Gulli Meel · Answer 1 · 2012-07-23T13:52:40.227

DERIVED tables are not to store the intermediate results as you expect. These are just a way to make code simpler. Using derived table doesnt mean that the derived table expression will be executed first and output of that will be used to join with remaining tables.Optimizer will automaticaly faltten derived tables in most of the cases.

However,There are cases where the optimizer might want to store the results of the subquery and thus materilize instead of flattening.It usually happens when you have some kind of aggregate functions or like that.But in your case the query is too simple and thus optimizer will flatten query

Also,storing derived table expression wont make your query fast it will in turn could make it worse.Your real problem is too much normalization.Fix that query will be just a join of two tables.

Why you have this kind of normalization?Why you are storing col values as rows.Try to denormalize table1 so that it has two columns first and last.That will be best solution for this.

Also, do you have proper indexes on id and tag column? if yes then a merge join is quite good for your query.

Please provide index details on these tables and the plan generated by your query.

Your query will be used like an inner join query.

select table2.occupation, first.valkue as first, last.value as last 
from 
table2
inner join table1  first 
on first.tag = 'first'
and first.id =table2.id
inner join table1 last
on last.tag = 'last'
and table2.id = last.id

my example is a very simple representation of the actual sql I'm working with. The real thing joins 8 or so tables and has lots and lots of conditions that are dynamic based on user filtering selections. There's no one query to optimize here. As for the normalizing the vertical table, well, yea, in an ideal world maybe I could do that, but it's a vertical table for a reason, there are many fields other than first and last and the number and type of those fields depends on various other user selection things. Can't be flattened into a set of columns in a table. — stu, Jul 23 '12 at 13:57
That is fine still the flattened query will be better than materialized.Can you please tell me what columns have indexes?Can you post the plan?Which version are you using?in case no plan post the statistics io,time details? — Gulli Meel, Jul 23 '12 at 14:50
very complicated and probably company secrets so I can't do that, sorry. — stu, Jul 23 '12 at 16:09

score 0 · Accepted Answer · answered Jul 23 '12 at 12:46

0

I think what you're asking for is a COMMON TABLE EXPRESSION. If your platform doesn't implement them, then a temporary table may be the best alternative.

answered Jul 23 '12 at 12:46

podiluska

50,950
7
98
104

I looked up CTE and you're right, that sounds like it. but based on this, sybase 15 doesn't support it... http://stackoverflow.com/questions/4803573/does-sybase-ase-12-5-support-common-table-expressions – stu Jul 23 '12 at 12:53
Then a temporary table may be your only alternative. Of course, the SQL optimizer in Sybase may be smart enough to only do the calculation once... have you checked? – podiluska Jul 23 '12 at 12:59
well after some looking (I only brought up this question because the query takes 15 seconds to run) I realized that sybase is doing a table scan on a rather large table, and when I remove a separate completely unrelated join the query runs in 5 seconds, even with the 2 derived tables, so now I'm thinking the derived tables aren't the problem, but I still always wondered if there was a better way to do a derived table, and it sounds like CTE is it. Thanks. – stu Jul 23 '12 at 13:29

score 0 · Answer 3 · answered Jul 23 '12 at 13:09

0

I'm a little confused. Your query looks okay . . . although it looks better with proper join syntax.

select table2.occupation, firsttable.first, lasttable.last
from table2 join
       (select value as first from table1 where tag = 'first') firsttable
       on table2.id = firsttable.id join
       (select value as last from table1 where tag = 'last') lasttable
       on table2.id = lasttable.id

This query does what you are asking it to do. SQL is a declarative language, not a procedural language. This means that you describe the result set and rely on the database SQL compiler to turn it into the right set of commands. (That said, sometimes how a query is structured does make it easier or harder for some engines to produce efficient query plans.)

answered Jul 23 '12 at 13:09

Gordon Linoff

1,242,037
58
646
786

The answer is that his query looks fine. I don't see what the question is. The condition is already in the query. – Gordon Linoff Jul 23 '12 at 13:20
The question is, is there a better way to make the derived table, or is there something else I can use that will restrict the size of the derived table to make it more efficient. This is a simple example, the real query has more fields and more expressions in the where clause that may make the query better or worse. – stu Jul 23 '12 at 13:23
I got that. The answer is basically "no". Sybase compiles the query into its own internal representation, which should, theoretically, be the best query plan given the information it has. Sometimes engines are wrong and you can tweak them, but your query is just fine. If you have a specific question about a query plan and performance, then you should post that as a separate question. – Gordon Linoff Jul 23 '12 at 13:28

Is there some equivalent to subquery correlation when making a derived table?

3 Answers3