4

Due to some dynamic SQL complications, I have a need to join two tables together and query from the result as if it were a single table. The twist is that I have to SELECT * on the inner query -- I can't alias the columns. There seems to then be no way to identify those columns in the outer query.

Take for example these tables

CREATE TABLE test_tab1 (join_key NUMBER, tab1_uqcol VARCHAR2(30));
CREATE TABLE test_tab2 (join_key NUMBER, tab2_uqcol VARCHAR2(30));
INSERT INTO test_tab1 VALUES (1, 'table 1 only');
INSERT INTO test_tab2 VALUES (1, 'table 2 only');

I can easily enough filter on their unique columns

SELECT *
FROM (
    SELECT *
    FROM test_tab1 t1
    INNER JOIN test_tab2 t2
    ON t1.join_key = t2.join_key
) joined
WHERE tab1_uqcol LIKE 'table%';

but if I try to refer to a column they have in common...

SELECT *
FROM (
    SELECT *
    FROM test_tab1 t1
    INNER JOIN test_tab2 t2
    ON t1.join_key = t2.join_key
) joined
WHERE join_key = 1;

then it tells me that join_key is an invalid identifier. Ditto t1.join_key, test_tab1.join_key, "t1.JOIN_KEY", "T1.JOIN_KEY", and "TEST_TAB1.JOIN_KEY". How do I identify these star-selected columns?

Colin P. Hill
  • 422
  • 4
  • 18
  • 2
    If you're using dynamic SQL, why can't you list out the columns you want? Can you use the `USING` clause rather than the `ON` clause to do the join so that `join_key` is only returned once? – Justin Cave Jun 26 '15 at 16:41
  • You select just the columns you need. In your example your subquery returns joined_key twice. Why would you need that? – Dan Bracuk Jun 26 '15 at 16:43
  • 1
    Is that even valid in Oracle? Sql Server wouldn't allow two columns with the same name in a derived table. You'd get an error `The column 'join_key' was specified multiple times for 'joined'.` – Martin Smith Jun 26 '15 at 16:45
  • Have you tried WHERE Expr1=1 – Greg Viers Jun 26 '15 at 16:48
  • 1
    I was wondering the same thing @MartinSmith. @Malnormalulo... Why do you need to use `Select *`? Why can't you just explicitly list the columns you need in the derived table? – Samizdat Jun 26 '15 at 16:51
  • @Justin: Because this block of SQL is going to be reused in a large number of procedures, and I cannot necessarily predict what columns the WHERE clause (generated by a separate function, not of my design) will want -- just that they will come from these two tables. `USING` won't work, because some columns they have in column are nullable (again, not my design). – Colin P. Hill Jun 26 '15 at 16:59
  • 1
    What does the nullability of columns have to do with being able to use a `USING` clause rather than an `ON`? Even if you can't predict what columns will be needed, surely you can dynamically select every column and simply not select `join_key` twice. – Justin Cave Jun 26 '15 at 17:05
  • I might need to select a nullable column. Naming all shared columns in the `USING` clause will give them all names that I can identify -- but it only joins on equality, not on mutual nullity. Dynamically naming the columns with dba_tab_cols might be a feasible workaround...I like that idea, and might use it. But it still leaves me wondering about the presented problem as an academic point! – Colin P. Hill Jun 26 '15 at 17:13
  • You don't need to name all shared columns in the `USING` clause, just those that you are joining on. Are you saying that your actual join logic includes many columns some of which are `NULL` where you want the join to treat keys as equal where they are both `NULL`? Are you saying that your actual case has columns of the same name in both tables that are not part of the join? Or am I misunderstanding your point? – Justin Cave Jun 26 '15 at 17:21
  • Yes, some shared nullable columns are not used in the join (though they could -- it wouldn't hurt, I'd just need to make sure I join on equality OR both being null). Any shared columns not named in the USING clause (because they can't be, because they're nullable) will not benefit from that proposed solution -- I'll still be lacking a way to identify them in the outer query. – Colin P. Hill Jun 26 '15 at 17:28

2 Answers2

1

You don't need to use Aliases, but you do need to be more explicit with naming columns in your derived table. For example...

    SELECT *
    FROM (
        SELECT t1.join_key, t1.tab1_uqcol, t2.tab2_uqcol
        FROM test_tab1 t1
        INNER JOIN test_tab2 t2
        ON t1.join_key = t2.join_key
    ) joined
    WHERE join_key = 1;

Since you're using an inner join there is no need to differentiate between the two different join_key columns. You just need to list one of them in your inner query so that there is no confusion in the outside query.

Samizdat
  • 138
  • 8
  • Possibly a workaround, but it doesn't answer the question as stated – Colin P. Hill Jun 26 '15 at 17:15
  • An Alias isn't what you think it means. t1.join_key AS t1_join_key is an example of an alias. t1.join_key is an example of an explicit column identifier. Sam's solution is explicitly listing the subset of columns to be returned WITHOUT using aliases. You can't have duplicate columns in a subquery. There is no way around that. All columns of a subquery must have unique identifiers. SELECT * doesn't change this requirement. It is just a shortcut for all the columns which in your case do not have unique identifiers. – Gerald Davis Jun 26 '15 at 17:31
  • I haven't misunderstood what an alias is -- the title is a summary. In the question proper I specified that I must `SELECT *` in the inner query. And despite what SQL Fiddle says, it *is* valid in Oracle 11g -- I'm staring at the result right now. – Colin P. Hill Jun 26 '15 at 17:37
  • That said, it does stand to reason that columns must have unique identifiers. Which is why the question asks what that unique identifier *is* in this case. – Colin P. Hill Jun 26 '15 at 17:38
  • Yes, it's an interesting academic question since Oracle appears to allow these duplicate columns in the inner query. Other RDBMS simply don't allow it. Personally, I think this is a bug in Oracle... not a feature that should be exploited. See this thread for a similar discussion on column ambiguity in Oracle 10g. http://stackoverflow.com/questions/6293193/why-oracle-10g-doesnt-complain-about-column-ambiguity – Samizdat Jun 26 '15 at 23:09
1

Your queries are not valid SQL. The columns specified in a subquery must be unique and * is simply a syntactic sugar for all columns. Even the one you said is 'easy enough' is invalid.

It appears I was incorrect. Oracle does not seem to follow ANSI SQL and allows subqueries which can produce ambiguous column sets. This was compounded by the fact that SQLFiddle requires inserts to be done on the DML side so it provided a false positive.

SQL Fiddle

Schema Setup (Oracle 11g R2)

CREATE TABLE test_tab1 (join_key NUMBER, tab1_uqcol VARCHAR2(30));
CREATE TABLE test_tab2 (join_key NUMBER, tab2_uqcol VARCHAR2(30));

Query 1:

INSERT INTO test_tab1 VALUES (1, 'table 1 only');
INSERT INTO test_tab2 VALUES (1, 'table 2 only');

SELECT *
FROM (
    SELECT *
    FROM test_tab1 t1
    INNER JOIN test_tab2 t2
    ON t1.join_key = t2.join_key
) joined

Results:

ORA-00900: invalid SQL statement

There are three ways to modify your queries to resolve this:

  1. Ensure all the tables have unique columns (i.e. join_key1, join_key2).
  2. Explicitly list all columns with dynamically produced aliases guaranteed to be unique such as tablename + columnname (i.e. t1.join_key AS test_tab1_join_key, t2.join_key AS test_tab2_join_key ...).
  3. Explicitly list a subset of columns by the second pair of join columns (i.e. t1.join_key, t1.tab1_uqcol, t2.tab2_uqcol).
  4. Avoid using subquery all together (SQLFiddle example).

All three have different drawbacks so it depends on how you need to use this query. The second one is guaranteed to produce referenceable column sets for all possible tables. The third method may be a little easier but it may fail if two tables share the same column name for a column other than the joining columns.

On edit: I stand corrected. You are right. Oracle does allow you to reference duplicate columns in the subquery. Even if Oracle allows you to do it, I would still avoid it as it is bad practice. It also will not produce valid SQL in most RDMS.

I will leave the answer because unless someone finds an undocumented way to reference a duplicate column name from an outer query you may have no choice but to modify the inner query.

Gerald Davis
  • 4,541
  • 2
  • 31
  • 47
  • 1
    It is valid SQL in Oracle 11g. I literally copied and pasted it from PL/SQL Developer where I drafted and tested it before writing up my question. – Colin P. Hill Jun 26 '15 at 17:31
  • 1
    It looks like you are correct. SQL Fiddle requires the inserts to be on the DML side. I updated my answer. It seems bizarre Oracle would allow non-unique column identifiers in a subquery but they do. Still I will leave the answer because unless someone comes along with some hack to identify one of many columns with same identifier in the outer query you will need to change the inner query. I think #2 above is probably going to be the least painful for you. – Gerald Davis Jun 26 '15 at 17:51
  • Fair enough. I do appreciate the workarounds, and I can hack my way around the problem. I suppose at this point I should take the answer to the question to be "there is no way to identify the column", because nobody seems to know of one! – Colin P. Hill Jun 26 '15 at 18:27
  • Well there is no way I know of. The question isn't that old yet. Someone may know of one but if they don't you have a workaround. I thought maybe one could use ordinal (it would be a pretty gross hack but say where [0] = 1) or whatever syntax would be needed to mean where zero column equals 1. It looks like Oracle only supports ordinal values in the order by clause so that is a dead end. – Gerald Davis Jun 26 '15 at 18:30