When a distributed search runs, the initial query is forwarded to every shard of the collection being queried.

My question is: which machine aggregates the results from the shards?

Is it the machine that receives the initial request?

Gray
Yago Riveiro

1 Answer

You're right: the node that receives the initial request does the aggregation. Moreover, there are several stages, all of which are managed by that same node:

1. Send the query to all shards (one member of each), gather the paged results, and merge them into a single page.
2. If grouping is enabled, request the grouped results from the corresponding shards.
3. Request field values from the shards that hold the final set of documents.
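
To make the first and last stages concrete, here is a rough Python sketch of what the coordinating node does with the shard responses. It is only an illustration, not Solr's actual code: the function names, the `(doc_id, score)` tuples, and the shard names are all made up for the example, and grouping (stage 2) is left out.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical shape of the shard responses: each shard returns its own
# top hits as (doc_id, score) pairs. This is an assumption for the sketch.
ShardHits = Dict[str, List[Tuple[str, float]]]


def merge_top_hits(shard_hits: ShardHits, rows: int) -> List[Tuple[str, str, float]]:
    """Merge the per-shard pages into one global page of at most `rows` hits."""
    tagged = (
        (score, doc_id, shard)
        for shard, hits in shard_hits.items()
        for doc_id, score in hits
    )
    # Keep only the `rows` best-scoring hits across all shards.
    top = heapq.nlargest(rows, tagged, key=lambda t: t[0])
    return [(doc_id, shard, score) for score, doc_id, shard in top]


def shards_to_fetch(page: List[Tuple[str, str, float]]) -> Dict[str, List[str]]:
    """Group the final doc ids by the shard that holds them, so that field
    values are requested only from those shards."""
    by_shard: Dict[str, List[str]] = {}
    for doc_id, shard, _score in page:
        by_shard.setdefault(shard, []).append(doc_id)
    return by_shard
```

For example, with two shards each returning two hits and `rows=2`, `merge_top_hits` keeps the two best-scoring documents overall, and `shards_to_fetch` tells the coordinator which shard to ask for each document's fields.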

lexk
  • Aggregating facet data is very memory-intensive, and my machines don't have a lot of memory. I'm wondering whether making a machine with a big heap responsible for the final aggregation of all shard responses would take pressure off the "little" machines. – Yago Riveiro Oct 21 '13 at 15:54
  • Unfortunately, it won't help you, because 'the final aggregation' is a lightweight operation. Most of the work is done internally on each and every node. – lexk Oct 22 '13 at 17:31
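
As a rough illustration of why the final step is cheap, the sketch below sums facet counts that each shard has already computed. This is a simplification under stated assumptions, not Solr's implementation: `merge_facet_counts` and the dictionary shape are hypothetical, and details such as facet limits or refinement requests are ignored.

```python
from collections import Counter
from typing import Dict, Iterable


def merge_facet_counts(per_shard_counts: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Sum the facet counts returned by each shard into one result.

    Each shard has already done the expensive part (counting over its own
    index); the coordinator only adds up small term -> count maps.
    """
    total: Counter = Counter()
    for counts in per_shard_counts:
        total.update(counts)  # adds counts term by term
    return dict(total)
```

The coordinator's cost here grows with the number of facet values returned, not with the number of documents each shard had to scan.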