2

I'm really interested to know whether there is any Performance gain using JOIN as compared to Cartesian Product for equivalent SQL queries.

Or Joins are just the Syntactic Sugar for Cartesian Product.

Can anyone let me know the implementation of Cartesian Product and JOIN. Is execution wise both takes same Time Complexity. Or implementation differs from Database to Database.

I have read through the following post, but it seems there are still split views on this:

Performance of Inner Join vs Cartesian product

Explicit vs implicit SQL joins

Community
  • 1
  • 1
bharatj
  • 2,327
  • 1
  • 25
  • 25
  • 1
    From what you wrote, all I can conclude is that you are optimizing something while not knowing what Cartesian Product is, nor what a JOIN is. Clear case of XY problem. Is your question about theta join vs regular join? – N.B. Aug 04 '15 at 10:14
  • I have asked for equivalent queries. For the same result, when done with Cartesian product(along with where filter) and doing the same thing using JOIN. In case of CP: Does It create the Cartesian product first and then perform the "where" clause filters. or filtering happens while inside the second inner loop. (In case we are using two tables). As If Cartesian product is large enough then creating the result-Set in two passes takes good amount of time, whereas If result is filtered while iterating then second pass is not required, which is faster as compared to first approach. – bharatj Aug 04 '15 at 10:28
  • Cartesian product is a mathematical operation, and in SQL as you put it, it's a JOIN. Therefore a JOIN can't be faster than JOIN if both are the same. The one you linked is called theta-style JOIN, and I take it that's what you think a Cartesian Product is. There is no "difference" between two JOINs, MySQL won't spend too much time parsing that info. However, you are free to conclude on your own whether it's quicker to retrieve the data you need versus all the rows timed with all the columns. You're mixing apples and oranges and you're not helping anyone help you. XY problem, read about it. – N.B. Aug 04 '15 at 10:37
  • @NB : I understand what you are trying to convey(mixing relational algebra names with SQL, even though I have clearly explained CP with where clause), but still not getting the answer for what I'm asking. If possible then let me know are the below queries are same performance wise : (select * from A inner join table B on A.id =B.id;) and (select * from A, B where A.id = B.id;). How the query with join execute and is different from second query. How execution for these queries happens. If possible could you let me know programmatically how these queries execute to generate the desired result. – bharatj Aug 04 '15 at 11:02
  • The two queries in your comment **are not the same**. The second one returns more data. It's slower. – N.B. Aug 04 '15 at 11:05
  • Ok, lets assume both return same amount of data, then ? – bharatj Aug 04 '15 at 11:43
  • Right, having re-read your example queries - I first have to *apologize* - the two queries are identical. The only difference is in the style. Actual lexing and MySQL execution engine will do the same job. There is no difference in how fast both will be executed. – N.B. Aug 04 '15 at 12:14

2 Answers2

0

Can't comment, not enough rep. But Cartesian Product provides all columns, instead of the ones you specifically need. This won't cause performance issues on the database, but when you come to use the data extra work may be needed to format what you need to display to the end user, causing performance issues in the code. I'd refrain from cartesian as databases grow, and potentially expand, bugs can be introduced.

Chris
  • 99
  • 3
  • 14
0

I may be entirely wrong, but here's my view. In some university database classes, they refer to the method of joining tables described by the OP as cartesian joins, since based off of the cartisian product of matrices or vectors, versus the explicit join clause, which is probably the reason the OP referred as such. Mathematics is the foundation of computers as recalled, and it's used for the lexical and logical specifications of the SQL language, specifically relational algebra and relational calculus. We learned beneath the covers of SQL in college not per say the industry jargon. I think this is the reason for the disconnect between academia and industry, theoretical versus the market/ marketable applications based on trending business practices. I also think it's not a good idea to incorporate business in academia unless a business-related class for various reasons. Maybe, a little humility will help the human race progress towards greater enlightenment and improvement. There are many quotes from the greatest minds about the illusions created with our senses of knowledge and such; forge no limits and rigid understandings.

K. Delery
  • 1
  • 3