I have a collection of documents that belong to a few authors:
[
{ id: 1, author_id: 'mark', content: [...] },
{ id: 2, author_id: 'pierre', content: [...] },
{ id: 3, author_id: 'pierre', content: [...] },
{ id: 4, author_id: 'mark', content: [...] },
{ id: 5, author_id: 'william', content: [...] },
...
]
I'd like to retrieve and paginate a distinct selection of the best-matching documents, one per author id:
[
{ id: 1, author_id: 'mark', content: [...], _score: 100 },
{ id: 3, author_id: 'pierre', content: [...], _score: 90 },
{ id: 5, author_id: 'william', content: [...], _score: 80 },
...
]
Here's what I'm currently doing (pseudo-code):
# Deduplicate client-side, keeping the first hit returned for each author
unique_docs = res.results.to_a.uniq { |doc| doc.author_id }
The problem is pagination: how do I select 20 "distinct" documents?
Some people point to term facets, but I'm not actually building a tag cloud:
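To make the pagination issue concrete, here is a minimal sketch in plain Ruby with no Elasticsearch involved. The `Doc` struct, the `distinct_by_author` helper, and the scores for ids 2 and 4 are made up for illustration and are not part of my real code or data. It shows that deduplicating a page after fetching it leaves fewer documents than the page size:

```ruby
# Hypothetical stand-in for a search hit; real result objects may differ.
Doc = Struct.new(:id, :author_id, :score)

# Walks a score-ordered list and keeps the first hit per author,
# stopping once `limit` distinct authors have been collected.
def distinct_by_author(docs, limit)
  seen = {}
  picked = []
  docs.each do |doc|
    next if seen.key?(doc.author_id)
    seen[doc.author_id] = true
    picked << doc
    break if picked.size >= limit
  end
  picked
end

# One "page" of 5 hits, already sorted by descending score.
hits = [
  Doc.new(1, 'mark', 100),
  Doc.new(3, 'pierre', 90),
  Doc.new(4, 'mark', 85),
  Doc.new(2, 'pierre', 82),
  Doc.new(5, 'william', 80)
]

# Asking for 5 distinct documents from a page of 5 hits yields only 3,
# because two of the hits repeat an author already seen.
page = distinct_by_author(hits, 5)
```

With a page size of 20 the same thing happens: after deduplication a page holds anywhere from 1 to 20 documents, so page boundaries no longer line up with the `from`/`size` offsets sent to the server.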
- Distinct selection with CouchDB and elasticsearch
- http://elasticsearch-users.115913.n3.nabble.com/Getting-Distinct-Values-td3830953.html
Thanks,
Adit