0

I'm a developer with limited experience of DB architecture. I just started to rough up what my DB for rethinkdb should look like and I came up with something like this:

  • User

    • Project IDs (including access level)
    • ...
  • Project

    • Or should this have User IDs (and access level)?
    • Language IDs (1 to many)
    • ...
  • Language

    • Or should this have a Project ID instead?
    • ...
  • Language Entry

    • Language ID
    • ....

It occurred to me that the IDs which join the data could go in a couple of different places:

  • Should a User hold the Projects they belong to and other data like access level, or should a Project hold the Users?
  • Should the Project hold Language IDs, or should a Language hold the Project ID?

What's best practice/more efficient?

My guess is that it's more efficient to do:

  • User access site, we know the Project IDs so can grab them
  • Project IDs know the Language IDs so can grab them
  • Language knows the Language Entry IDs so can grab them

Especially where there could be a LOT of Language Entries across all projects, my assumption is that it would be slow to iterate through them all looking for matching Language IDs.

Dominic
  • 62,658
  • 20
  • 139
  • 163

1 Answers1

1

This isn't really a matter of efficiency, it falls under the discipline of logic and physical data modelling. The relational database should be created so that it can accommodate the real world situations (like a car manufacturer having multiple car models), but also be constrained so that it prevents any situations that cannot exist in the real world (like a car model having two manufacturers).

Here's some logic to follow, based on understanding the real-world situation.

  1. If a language can only ever have one project, but a project can have many languages, then put project_id on languages. language belongs_to project and project has_many languages
  2. If a language can have many projects, but a project can only ever have one language, then put language_id on projects. language has_many projects and project has_many languages
  3. If a language can have many projects, and a project can have many languages, then create a new project_language model with both id's. language has_many projects through project_languages and project has_many languages through project_languages

I suspect that users and projects are in the third category, so you need the intermediate table, which could also store privileges.

With the correct associations, scopes etc. you'll then be able to do such magic as @user.projects, @project.users, @project.languages, @user.project_languages - @user.languages

David Aldridge
  • 51,479
  • 8
  • 68
  • 96
  • Thanks in this case languages are created by the users for their projects so it probably falls under (1). I'll have a closer read of your logic and try and understand it and start implementing something with it tomorrow - thanks – Dominic Nov 09 '15 at 23:39