I have documents like these:

  Doc1 - {
    'category': [c1, c2],
    'location': [l1, l2]
  }

  Doc2 - {
    'category': [c1],
    'location': [l1]
  }

Each array in Doc2 is a subset of the corresponding array in Doc1.

Can I use a MongoDB aggregation query to mark such documents as duplicates, or should I unwind the arrays and run a full-text search on each one?

I am open to changing the document structure (replacing the arrays with something else) if that meets my business requirement.

I need to prevent users from entering documents with overlapping array values like the ones above.

Business reason: the user provides one value for each array key, and at evaluation time the system should match exactly one document or none.

  • You can $unwind on 'category', then on 'location', and then $group by category and location. That way you can $push the _id of every document that matches the group. You end up with an array of 'category+location' entries, each holding the list of all matching _id values. Each list has at least one element, or two or more if duplicates are found. Can this work for you? – Daniele Tassone Aug 19 '18 at 11:57
  • This would solve my problem. I have a good number of such documents. Would this combination approach be better, or should I search by full text after creating a full-text index on each array? – Rokesh Anumandla Aug 19 '18 at 18:02
  • Why would a full-text index help? You would still need to $unwind and $group. – Daniele Tassone Aug 19 '18 at 19:27
  • You may also use $filter to filter out documents: https://docs.mongodb.com/manual/reference/operator/aggregation/filter/ – richi arora Aug 21 '18 at 15:04
  • @DanieleTassone I was able to find my solution using your approach. If you can post your comment as answer, I will accept it. – Rokesh Anumandla Aug 24 '18 at 19:22
  • @RokeshAnumandla done – Daniele Tassone Aug 25 '18 at 17:02

1 Answer

You can $unwind on 'category', then on 'location', and then $group by the (category, location) pair. In the $group stage you can $push the _id of every document that falls into that group.

The result is one entry per 'category+location' combination, each holding the list of all matching _id values. Every list has at least one element; a list with two or more elements means duplicates were found.
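
The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: the collection name `rules`, the output field names `ids` and `count`, and the final `$match` stage (which keeps only the combinations shared by more than one document) are not from the question. The `findOverlaps` helper is a hypothetical in-memory equivalent, included only so the effect of the stages can be seen without a running database.

```javascript
// Pipeline sketch: one output entry per (category, location) pair.
// "ids" and "count" are illustrative field names, not from the question.
const pipeline = [
  { $unwind: "$category" },
  { $unwind: "$location" },
  {
    $group: {
      _id: { category: "$category", location: "$location" },
      ids: { $push: "$_id" },   // every document that contains this pair
      count: { $sum: 1 }
    }
  },
  // Assumed extra stage: keep only pairs claimed by more than one document.
  { $match: { count: { $gt: 1 } } }
];
// In the mongo shell this would be run as: db.rules.aggregate(pipeline)
// ("rules" is a hypothetical collection name).

// Hypothetical in-memory equivalent of the pipeline, for illustration only.
function findOverlaps(docs) {
  const groups = new Map();
  for (const doc of docs) {
    for (const category of doc.category) {      // first $unwind
      for (const location of doc.location) {    // second $unwind
        const key = JSON.stringify([category, location]);
        if (!groups.has(key)) {
          groups.set(key, { category, location, ids: [] });
        }
        groups.get(key).ids.push(doc._id);      // $group with $push
      }
    }
  }
  // Equivalent of the $match stage: more than one document per pair.
  return [...groups.values()].filter((g) => g.ids.length > 1);
}

const docs = [
  { _id: "Doc1", category: ["c1", "c2"], location: ["l1", "l2"] },
  { _id: "Doc2", category: ["c1"], location: ["l1"] }
];

const overlaps = findOverlaps(docs);
console.log(JSON.stringify(overlaps));
// prints [{"category":"c1","location":"l1","ids":["Doc1","Doc2"]}]
```

With the two documents from the question, only the pair (c1, l1) is shared, so Doc1 and Doc2 are reported together as conflicting.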

Daniele Tassone