Use index or replace values by surrogate ids and add a separate table?

Question

Currently I've got a couple of pivot tables with this structure in mysql:

user_id | organisation_id | type

4   3   external
4   3   internal
6   4   internal
7   4   external
6   5   external
7   5   external
7   6   internal

As you can see, in my type column I only use external and internal. Currently the type column is a enum and is indexed.

Is this a good database structure or should I make a separate table for the column type?

I've got this "problem" with a couple of tables so it's an important decision. The table could become quite big.

What would you do in this situation?

What does "make a separate table for the column type" mean? Please explain/illustrate. Do you mean, replace type values by surrogate ids and have a table associating ids & type values? — philipxy, Jul 23 '17 at 06:50
I already improved your title to say that. Please edit your question, comments are not for clarifications, they are ephemeral and for requesting & pinning down clarifications to be edited into posts. PS You need to be clear, precise and complete in questions. Applicable but vague descriptions/summaries of (parts of) what you are thinking of do not communicate what you are thinking of. Complete accompanying examples also help. — philipxy, Jul 23 '17 at 10:57
Does this answer your question? [Decision between storing lookup table id's or pure data](https://stackoverflow.com/questions/383026/decision-between-storing-lookup-table-ids-or-pure-data) — philipxy, Jun 11 '20 at 14:40

score 2 · Accepted Answer · answered Jul 22 '17 at 09:36

This depends a great deal on the kind of queries that you are going to run against this table.

If you plan to run queries that ignore the type column, i.e. do not care if it's internal or external, or queries that decide what type they want at run-time, e.g. through user input, there is a great advantage to keeping everything in a single table.

Generally, indexing a column with poor selectivity, i.e. with roughly half the rows in each of its two categories, can deliver only marginal performance improvements compared to a full table scan. Same goes for splitting the table in two: it will make queries more complex with very little improvement.

Use index or replace values by surrogate ids and add a separate table?

1 Answers1