But, in a MongoDB database, as far as I know, when I store a Department object it stores all associated Employees. Is not that duplicating the information?
First of all, the statement above is not quite correct. From MongoDB's perspective, whatever you provide as BSON is stored as is. If you send the employees together with the department then yes, they are stored inside the department document. You can also apply partial updates after creating the department (e.g. using the $set operator). But I think the scope of your question is broader than this.
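To make the partial update idea a bit more concrete, here is a minimal sketch that builds the filter and update documents for a $set operation. The helper name `build_set_update`, the `departments` collection handle and the new location value are my own assumptions for illustration, not something from your setup.

```python
from typing import Any


def build_set_update(department_id: int, changes: dict[str, Any]) -> tuple[dict, dict]:
    """Return a (filter, update) pair that changes only the given fields."""
    if not changes:
        raise ValueError("at least one field is required for $set")
    query = {"id": department_id}      # which department to touch
    update = {"$set": dict(changes)}   # only these fields are overwritten
    return query, update


# Hypothetical change: move department 1 to another location.
query, update = build_set_update(1, {"department_location": "Rotterdam"})

# With PyMongo, assuming `departments` is a collection handle, this would be:
#     departments.update_one(query, update)
# Fields not named in $set (e.g. an embedded employee list) stay untouched.
```

So storing a department does not force you to resend its employees on every write.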
IMHO, creating nano-services for each document/table in the database is not a good approach, especially when the services are only responsible for basic CRUD operations. You should first define your bounded contexts, aggregate roots, etc. In short, do not try to design tables before mapping business requirements to domain objects. What I'm trying to say is: use DDD principles :)
These are the strategies that I found so far. When designing microservices you should also consider pros and cons of each strategy. (See bottom for references.)
General Principles of Mapping Relational Databases to NoSQL
- 1:1 Relationship
  - Embedding
  - Linking with Foreign Key
- 1:M Relationship
  - Embedding
  - Linking with Foreign Key
  - (Hybrid) Bucketing Strategy
- N:M Relationship
  - Two-Way Referencing
  - One-Way Referencing
1:1 Relationship
The 1:1 relationship can be mapped in two ways:
- Embed the relationship as a document
- Link to a document in a separate collection
Tables:
// Employee document
{
"id": 123,
"Name":"John Doe"
}
// Address document
{
"City":"Ankara",
"Street":"Genclik Street",
"Nr":10
}
Example: Embedding (1:1)
- Advantage: Address can be retrieved with a single read operation.
{
"id": 123,
"Name":"John Doe",
"Address": {
"City":"Ankara",
"Street":"Genclik Street",
"Nr":10
}
}
Example: Link with foreign key (1:1)
{
"id": 763541685, // link this
"Name":"John Doe"
}
Address document holding the employee key:
{
"employee_id": 763541685,
"City":"Ankara",
"Street":"Genclik street",
"Nr":10
}
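A minimal sketch of what the linked variant costs at read time, using plain Python lists in place of the two collections. The function names are hypothetical; the point is only that the address needs a second lookup, whereas the embedded variant needs one.

```python
employees = [{"id": 763541685, "Name": "John Doe"}]
addresses = [
    {"employee_id": 763541685, "City": "Ankara", "Street": "Genclik Street", "Nr": 10}
]


def find_address(employee_id, addresses):
    """Second lookup: follow the employee_id link into the address collection."""
    for address in addresses:
        if address["employee_id"] == employee_id:
            return address
    return None


def load_employee_with_address(employee_id, employees, addresses):
    """Two reads: one for the employee, one for the linked address."""
    employee = next((e for e in employees if e["id"] == employee_id), None)
    if employee is None:
        return None
    # Merge the linked address into a copy, mimicking the embedded shape.
    return {**employee, "Address": find_address(employee_id, addresses)}


result = load_employee_with_address(763541685, employees, addresses)
```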
1:M Relationship
Initial:
// Department collection
{
"id": 1,
"department_name": "Software",
"department_location": "Amsterdam"
}
// Employee collection
[
{
"employee_id": 46515,
"employee_name": "John Doe"
},
{
"employee_id": 81584,
"employee_name": "John Wick"
}
]
Example: Embedding (1:M)
Warning:
- Employee list might be huge!
- Be careful when using this approach in a write-heavy system. IO load would increase due to housekeeping operations such as indexing, replication, etc.
- Pagination on employees is hard!!!
{
"id": 1,
"department_name": "Software",
"department_location": "Amsterdam",
"employees": [
{
"employee_id": 46515,
"employee_name": "John Doe"
},
{
"employee_id": 81584,
"employee_name": "John Wick"
}
]
}
Example: Linking (1:M)
We can link to the department by storing department_id in each employee document.
- Advantage: Easier pagination
- Disadvantage: Retrieving all employees that belong to department X needs a lot of read operations, since each employee is a separate document!
[
{
"employee_id": 46515,
"employee_name": "John Doe",
"department_id": 1
},
{
"employee_id": 81584,
"employee_name": "John Wick",
"department_id": 1
}
]
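Here is a rough sketch of why pagination gets easier with linking: each employee is its own document, so a page is just filter, sort, skip, limit. It runs on a plain list; the third employee and the function name `employees_page` are made up for the example.

```python
employees = [
    {"employee_id": 46515, "employee_name": "John Doe", "department_id": 1},
    {"employee_id": 81584, "employee_name": "John Wick", "department_id": 1},
    # Hypothetical extra row so the filter has something to exclude.
    {"employee_id": 90210, "employee_name": "Jane Roe", "department_id": 2},
]


def employees_page(employees, department_id, page, page_size):
    """Return one page (1-based) of a department's employees, ordered by employee_id."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    matching = sorted(
        (e for e in employees if e["department_id"] == department_id),
        key=lambda e: e["employee_id"],
    )
    start = (page - 1) * page_size
    return matching[start:start + page_size]


first_page = employees_page(employees, department_id=1, page=1, page_size=1)
second_page = employees_page(employees, department_id=1, page=2, page_size=1)

# With PyMongo, assuming `employees` were a collection handle, roughly:
#     employees.find({"department_id": 1}).sort("employee_id", 1).skip(start).limit(page_size)
```

An index on department_id would probably be needed to keep this query cheap on a large collection.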
Example: Bucketing Strategy (Hybrid 1:M)
We'll split the employees into buckets with a maximum of 100 employees in each bucket.
{
"id":1,
"Page":1,
"Count":100,
"Employees":[
{
"employee_id": 46515,
"employee_name": "John Doe"
},
{
"employee_id": 81584,
"employee_name": "John Wick"
}
]
}
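A minimal sketch of how such buckets could be produced from a flat employee list. I am assuming here that "id" is the department id, "Page" is the 1-based bucket number and "Count" is the number of employees in that bucket; the function name `make_buckets` and the generated staff list are hypothetical.

```python
def make_buckets(department_id, employees, bucket_size=100):
    """Split employees into bucket documents holding at most bucket_size entries."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be positive")
    buckets = []
    for start in range(0, len(employees), bucket_size):
        chunk = employees[start:start + bucket_size]
        buckets.append({
            "id": department_id,        # assumed: the owning department
            "Page": len(buckets) + 1,   # assumed: 1-based bucket number
            "Count": len(chunk),        # assumed: employees in this bucket
            "Employees": chunk,
        })
    return buckets


# Hypothetical department with 250 generated employees.
staff = [{"employee_id": n, "employee_name": f"Employee {n}"} for n in range(250)]
buckets = make_buckets(1, staff)
```

Reading page k of the employee list is then a single read of the bucket with "Page": k, which is the main appeal of the hybrid approach.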
N:M Relationship
To choose between Two-Way Referencing and One-Way Referencing, you must establish the maximum size of N and the maximum size of M.
For example, if N is a maximum of 3 categories per book and M is a maximum of 5,000,000 books per category, you should pick One-Way Referencing.
If N is a maximum of 3 and M is a maximum of 5, then Two-Way Referencing might work well (see: schema basics).
Example: Two-Way Referencing (N:M)
In Two-Way Referencing we include the book foreign keys under the Books field in the author document, and the author foreign keys under the authors field in the book document.
Author collection:
[
{
"id":1,
"Name":"John Doe",
"Books":[ 1, 2 ]
},{
"id":2,
"Name": "John Wick",
"Books": [ 2 ]
}
]
Book collection:
[
{
"id": 1,
"title": "Brave New World",
"authors": [ 1 ]
},{
"id":2,
"title": "Dune",
"authors": [ 1, 2 ]
}
]
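The price of two-way referencing is that every new relation has to be written on both sides. A minimal sketch on in-memory dicts keyed by id (the function name `link_author_to_book` is hypothetical):

```python
authors = {
    1: {"id": 1, "Name": "John Doe", "Books": [1, 2]},
    2: {"id": 2, "Name": "John Wick", "Books": [2]},
}
books = {
    1: {"id": 1, "title": "Brave New World", "authors": [1]},
    2: {"id": 2, "title": "Dune", "authors": [1, 2]},
}


def link_author_to_book(authors, books, author_id, book_id):
    """Record the relation on both documents, skipping ids already present."""
    author = authors[author_id]
    book = books[book_id]
    if book_id not in author["Books"]:
        author["Books"].append(book_id)
    if author_id not in book["authors"]:
        book["authors"].append(author_id)


# Hypothetical change: author 2 becomes a co-author of book 1.
link_author_to_book(authors, books, author_id=2, book_id=1)
```

In MongoDB the "add only if absent" part is what the $addToSet update operator does, but these are still two separate writes to two documents, so you have to think about what happens if only one of them succeeds.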
Example: One-Way Referencing (N:M)
Example with books and categories: each book belongs to only a few categories, but a single category can contain many books.
- Advantage: Optimizes read performance
- The references to categories are embedded in the books because there are far more books in a category than categories in a book.
Category collection:
[
{
"id": 1,
"category_name": "Science Fiction"
},
{
"id": 2,
"category_name": "Dystopian Fiction"
}
]
An example of a Book collection with foreign keys for Categories:
[
{
"id": 1,
"title": "Brave New World",
"categories": [ 1, 2 ],
"authors": [ 1 ]
},
{
"id": 2,
"title": "Dune",
"categories": [ 1 ],
"authors": [ 1, 2 ]
}
]
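With one-way referencing, "all books in category X" is answered from the book side only, since categories hold no book ids. A minimal sketch on a plain list (the function name `books_in_category` is hypothetical):

```python
books = [
    {"id": 1, "title": "Brave New World", "categories": [1, 2], "authors": [1]},
    {"id": 2, "title": "Dune", "categories": [1], "authors": [1, 2]},
]


def books_in_category(books, category_id):
    """Return titles of books whose categories array contains category_id."""
    return [b["title"] for b in books if category_id in b["categories"]]


shared = books_in_category(books, 1)
single = books_in_category(books, 2)

# With PyMongo, assuming `books` were a collection handle, matching an
# element inside the array field would look like:
#     books.find({"categories": category_id})
```

The category documents stay small no matter how many books reference them, which is the reason to prefer this variant when M is huge.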
References
- Case study: An algorithm for mapping the relational databases to mongodb
- The Little MongoDB Schema Design Book
- 6 Rules of Thumb for MongoDB Schema Design