-1

I'm wondering which method below is faster?

Suppose:

  • Maximum 10,000 products, each product has 1 user id, 1 cat id, 3 extra fields, and 5 images.
  • 90-99% users come to the website just for the information, not posting.

Method 1: get all data from a table from a query without "JOIN":

SELECT * FROM products WHERE ...

Table: products

id | name | poster_name | cat_name | code_1 | code_2 | content |
       dimensions | contact | message | images |

Method 2: get all data from multiple tables with "JOIN":

SELECT ... FROM products
           LEFT JOIN cats ON products.cat_id = casts.id
           LEFT JOIN users ON ....

table: products

id | name | code_1 | code_2 | content | cat_id | poster_id |

table: cats

id | cat_name |

table: users

id | poster_name |

table: extra

id | product_id | extra_info | extra_data |

table: images

id | product_id | img_src |
Rick James
  • 135,179
  • 13
  • 127
  • 222
Louis Tran
  • 1,154
  • 1
  • 26
  • 46
  • 1
    The performance of queries is dependent on a lot of things, and may vary from system to system. E.g., did you define any indexes? How many records are in these tables? A query on one table is almost always going to be faster than a join but it depends on other things. If your one table query gets a million records but your join only fetches 10, the join will probably (but not definitely) be faster. Obviously a simple test on your part would be the best answer. – S. Imp Dec 16 '16 at 19:37
  • Does this answer your question? [Is it better to store redundant information or join tables when necessary in MySQL?](https://stackoverflow.com/questions/3237033/is-it-better-to-store-redundant-information-or-join-tables-when-necessary-in-mys) – philipxy Apr 14 '22 at 22:06

2 Answers2

1

The first method will usually be faster for reads, and the second one will help you maintain data integrity and usually will be faster for writes.

The transition from the later form to the former is called denormalization and is usually used in data warehouses, while operational ("live") databases usually prefer the later form (second method).

Jiri Tousek
  • 12,211
  • 5
  • 29
  • 43
  • If most users (90-95%) just visit to read the information on the website. And database structure doen't need to be changed frequently. Is the method 1 better in the case? – Louis Tran Dec 16 '16 at 19:49
  • 1
    That depends on your particular setup and needs. I would personally only consider method 1 for that kind of database if I was sure I absolutely need the performance increase, and that it cannot be gained any other way (e.g. improving indexes, tuning execution plans etc.). Also note that the method 1 consumes more space. – Jiri Tousek Dec 16 '16 at 19:53
  • When you said "method 1 comsumes more space". that meant you counted NULL values in the combination table and if I used method 2: I could save those spaces? I think in case all fields contain data, both method consumse the same space. – Louis Tran Dec 16 '16 at 20:08
  • 2
    That's not true - imagine thousand of the entries have category "Home accessories". In first case, all of them will have this string saved in each row, that is 1000 x 17 characters, so at least 1000 x 17 bytes. While in the second case, there will only be ID stored for each row, and the category text will only be stored once, so it could be something like 1000 x 4 bytes + 1 x (4 + 17) bytes. – Jiri Tousek Dec 16 '16 at 20:41
  • You are right. Method 1 consumes more space. But I think I'm gonna go with it because my database is small (biggest table always less than 500 records) and method 1 is easy to maintain, no complex query involved. Thank you. – Louis Tran Dec 17 '16 at 01:28
1

You have not finished asking the question. Method 2 has no WHERE, so it will deliver 10K rows, plus have to do 20K lookups into the other tables. That makes it the loser.

Since your real question is about performance, then let's discuss the WHERE clause. With that, we can optimize it so that the desired data tends to be in RAM.

Back to your question... JOIN is probably the 'right' way to do it. And it is not that much of a performance hit assuming you have the proper indexes. So provide SHOW CREATE TABLE (even if tentative) and complete WHERE clauses.

Don't over-normalize. For example, do not normalize datetime or any other 'continuous' values.

Normalization can save space, especially in huge tables (eg, millions or billions of rows, and large, frequently repeated, strings being normalized.) This is especially helpful when the table is too big to stay cached in RAM.

Rick James
  • 135,179
  • 13
  • 127
  • 222