We are working on a system where a person can send a document to another person, a document can have multiple attachments, as given below.
Document {
Set<Attachment> attachments;
}
If X sends a document (Doc1,Ver1) to Y and Y edits the document then we have to create a new version (Doc1,Ver2) of the document so that X sent box doesn't reflect the changes made by Y.
In our system there will be millions of documents where each document can have hundreds of attachments. Also a document can travel through n number of persons over a period of time.
We should be able to fetch any document version along with the list of attachments it had at that time, so I have to maintain versions of document, so the table structure that came to my mind immediately was the one below.
Document - id primary key, ver_id primary key
Attachment - id, doc_id foreign key, doc_ver_id foreign key
But the problem with this structure is that if X is sending a document to Y with 100 attachment and Y has made some minor modification then I have to create a new version along with copying all the attachments for the new version, most of which are same as the one in previous version, since we will be having millions of documents and each document will move through n number of persons this model will result in a very huge attachment table with lot of redundant data.
So we thought of an alternate structure for attachment, as below.
Document - id primary key, ver_id primary key
Attachment - id, doc_id, attached_ver_id, detached_version_id
But I couldn't create a hibernate entity for Document with this structure, so my question is, is there any other table structure which is better equipped to solve this problem with less redundancy and is it possible to create a hibernate mapping for the above table structure.