-1

For any two given tables, say table 1 has i rows and table 2 has j rows, can someone explain to me how it is possible that we end up with a total of i * j rows in a LEFT OUTER JOIN if table 1 is the "left" table ? I always thought that a LEFT OUTER JOIN would always return a total # of rows in the left table ?

oneCoderToRuleThemAll
  • 834
  • 2
  • 12
  • 33
  • It sounds like your creating a cross-join somehow. If you provide your SQL query people could help – Doug Oct 05 '18 at 02:07
  • Hi. This is a faq. Please always google many clear, concise & specificversions/phrasings of your question/problem/goal with & without your particular strings/names & read many answers. Add relevant keywords you discover to your searches. If you don't find an answer then post, using 1 variant search as title & keywords for tags. See the downvote arrow mouseover text. PS Learn what LEFT JOIN returns: INNER JOIN rows plus unmatched left table rows extended by NULLs. Always know what INNER JOIN you want as part of a LEFT JOIN. – philipxy Oct 05 '18 at 05:10
  • PS Your expectations probably assume a FK from the left table to the right table & a join on equality of corresponding FK-PK columns. – philipxy Oct 05 '18 at 07:39

3 Answers3

4

In an inner join, for each row from the "left" table there will be as many rows in the output as there are matching rows in the "right" table (matching on the join conditions, that is); this can be anything between 0 and j. So an inner join may return anywhere between 0 and i*j rows. Both are possible, by the way; just consider the join condition null is not null (to get 0 rows), or null is null (to get a cartesian join).

The only difference in an outer join (specifically, left outer join) is that for each row from the "left" table there will be at least one row in the output - even if there is no matching row in the right table. That's really what outer join means. So the only difference is that in a left outer join, the output will have between i and i*j rows, and again both are possible (with the same join conditions as above).

To your question about getting the max number of rows - for a somewhat more "natural" example, imagine both tables have a column purchase_date, and for some reason all rows in both tables have exactly the same (non-null) date in that column. Then if you join on left_table.purchase_date = right_table.purchase_date you will get a cartesian join, which has i*j rows.

1

A left join returns all the rows in the first table. For each row, it returns all the rows in the second table that match the on conditions. If no rows match the on conditions, then the row from the first table is still returned, with the additional columns NULLed out.

So, if you want i * j rows, just do:

from t1 left join
     t2
     on 1 = 1
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • I was wondering how was it possible that we can end up with i * j, not how I can get i * j rows...but I guess your answer kind of covers it(edited the question to reflect this though) – oneCoderToRuleThemAll Oct 05 '18 at 02:06
  • 1
    @GdgamesGamers . . . I think this answers the question. – Gordon Linoff Oct 05 '18 at 02:19
  • Left join does not in general return rows with only the columns of the left table, so "A left join returns all the rows in the first table" & "it returns all the rows in the second table" don't make sense. Something like them that you didn't write makes sense. Writing should always be clear but since @GdgamesGamers doesn't understand something so basic as left join--possibly because of reading ubiquitous unclear writing like this--lack of clarity is all the more problematic. – philipxy Oct 05 '18 at 06:32
  • 1
    @philipxy you appear to have missed the end of Gordon's statement - "it returns all the rows in the second table **that match the on conditions**". – Boneist Oct 05 '18 at 07:30
  • @Boneist Your comment does not reflect what I wrote or what the sentence you quote actually says. That longer quote is even more confused than what I quoted. No row returned by left join is a row from either input table. Certain input rows appear as *subrows* in result rows in a certain way. "Something like them that you didn't write makes sense." PS I did accidentally misleadingly say "does not in general" instead of "does not" (relations unlike SQL tables can have zero columns) & "of the left table" instead of "of either input table" (sloppy proofreading after edits.) – philipxy Oct 05 '18 at 08:15
  • @philipxy sorry, but you're not making much sense either. Clearly, the output of a join query contains information from the rows in each of the joined tables. Gordon is clearly saying that rows in the the output will consist of each row from the left table and any matching rows (based on the `on` conditions) in the right table for that left-table's row. That means 1 row in the left table may join to 0, 1 or more rows in the right table. – Boneist Oct 05 '18 at 08:16
  • @Boneist Please see my edited comment that crossed yours. Your comments don't reflect what is actually written. He is quite obviously not clear & I nevertheless explain an example. Read what is written & not what you think is written. And your own phrasing is unclear too. (I didn't say that what he is trying to say is wrong. I'm saying what he wrote doesn't say what he is trying to say.) But I repeat myself. – philipxy Oct 05 '18 at 08:24
0

The total number of rows can be i * j, or less then i * j depending on the join conditions and the data. Here are 2 examples.

CREATE TABLE t1(
   id   VARCHAR(1)
);

INSERT INTO t1(id) VALUES ('a');
INSERT INTO t1(id) VALUES ('b');
INSERT INTO t1(id) VALUES ('c');
INSERT INTO t1(id) VALUES ('d');

CREATE TABLE t2(
   id       VARCHAR(1)
  ,col2 INTEGER
);

INSERT INTO t2(id,col2) VALUES ('a',1);
INSERT INTO t2(id,col2) VALUES ('a',2);
INSERT INTO t2(id,col2) VALUES ('a',3);
INSERT INTO t2(id,col2) VALUES ('a',4);
INSERT INTO t2(id,col2) VALUES ('d',1);

This following query produces a count of 20, the equivalent of a cross join, but this is BY ACCIDENT due to imprecise join conditions.

select 'query 1', count(*) rows
from t1
left join t2 on t2.col2 > 0
;

all rows in t2 have col2 > 0, so every row of t1 is multpled by every row in t2 (4 t1 rows * 5 t2 rows = 20 rows in joined result).

This following query produces a count of 7, less than i * j, because the join condtions are precise

select 'query 2', count(*) rows
from t1
left join t2 on t1.id = t2.id and t2.col2 > 0
;
  1. the 'a' in t1 is matched to 'a' in t2 (4 rows)
  2. 'b','c' in t1 are not matched (2 rows)
  3. 'd' is matched but just to 1 row in t2 (1 row)
  4. so, 4 + 2 + 1 = 7 rows in total
Paul Maxwell
  • 33,002
  • 3
  • 32
  • 51