I am looking for a solution to a fairly basic question: is there a recommended practice for making sure that storing a file in GridFS does not create a duplicate? We have noticed that, on very rare occasions, our store call (about as simple as it can get, using the Java driver) creates a duplicate of the new file when it is executed in parallel. The call looks like this:
GridFS gridfs = new GridFS(db);
GridFSInputFile file = new GridFSInputFile(gridfs, fileContent, fileName, true);
file.put("type", "email");
file.setContentType(contentType);
file.save();
We use FSYNC_SAFE as the write concern here, and the collection is sharded. Should we avoid the driver's GridFS API entirely and write directly to the GridFS files collection so that we can add extra logic, or is it easier to check for and remove duplicates after the save completes (which is of course not optimal)?
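To make the "extra logic" option more concrete, here is a minimal sketch of one idea: derive a deterministic key from the file content, so that two parallel stores of the same content would compete for the same key rather than each getting a fresh one. The class name ContentKey and the method md5Hex are hypothetical, and using the resulting string as a unique key (for example as the file's _id, or in a uniquely indexed field) is an assumption rather than something the driver does for us. On a sharded collection, uniqueness is only enforced when the unique index is prefixed by the shard key, so this would need to be checked against our sharding setup.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of the MongoDB driver): turns file content
// into a deterministic key that identical content always maps to.
final class ContentKey {

    private ContentKey() {
    }

    // Returns the lowercase hexadecimal MD5 digest of the given bytes.
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to 0..255 so negative bytes format as two hex digits.
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] first = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] second = "hello".getBytes(StandardCharsets.UTF_8);
        // Identical content yields identical keys, so a second insert using
        // this key as a unique value would be rejected instead of duplicated
        // (assumption: the key is stored where uniqueness is enforced).
        System.out.println(md5Hex(first).equals(md5Hex(second)));
    }
}
```

With a key like this, the second of two parallel writers would see a duplicate-key error instead of silently creating a second file, provided uniqueness is actually enforced for that field in our deployment.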