0

I have 3 tables representing non generic many-to-many connection:

packages: package_id

items: item_id, package_id

users: user_id, package_id

Thus I am connecting users to items in such a way that, one user can be connected to several items, and items can be connected to several users. That is limited to the case of two users can't be connected to several items with partial intersection. And I am fine with that - natural limitation for my case. Packages here is a sort of join table and is used for managing package_id with its auto increment property.

Typical select in my task - find all items connected with a given user. And that select can be done not using packages table at all.

Thus I can remove Packages table from this scheme completely. The only thing left to do - careful choice for package_id for new items groups, which is rather doable.

The question is: why is it a bad idea?

iburyl
  • 105
  • 1
  • 4
  • To clarify, are you just asking what the downsides are to dropping the package table? How will new package_ids be defined? – Dan J May 20 '15 at 20:34
  • Yes. E.g. as max+1 for package_ids from items table. – iburyl May 20 '15 at 20:36
  • 1
    There are a variety of problems with that approach, mostly around concurrency (see [this question](http://stackoverflow.com/q/8956044/238688), for example). You might be better off either keeping the packages table or at least turning items.package_id into an auto-incrementing field. – Dan J May 20 '15 at 20:48
  • Thanks, concurency here might really be a problem – iburyl May 21 '15 at 05:31

1 Answers1

2
  1. You may use packages table to manage data consistency. I.e. if you need to find package_id from items not in you directory.
  2. You may use packages table to retreive some additional information about packages like packages.name.
  3. You may use packages table to find packages not present in items or users table.

If you need to do something from listed above it's a bad idea to remove packages table from your scheme.

  • With MySQL, 10000 rows in packages table, 10000 rows in users table and 20000 rows in items table, all generated randomly. I got about 20% performance benefit for SELECT with 2 tables vs. select with 3 tables – iburyl May 21 '15 at 15:35