
We have an object with nested properties which we want to make easily searchable. This has been simple enough to achieve, but we also want to aggregate information based on multiple fields. In terms of the domain, we have multiple deals which have the same details with the exception of the seller. We need to consolidate these into a single result and show the seller options on the following page. However, we still need to be able to filter by seller on the initial page.

We attempted something like the below to collect multiple sellers on one row, but the result contains duplicates and creating the index takes forever.

Map = deals => deals.Select(deal => new
{
    Id = deal.ProductId,
    deal.ContractLength,
    Provider = deal.Provider.Id,
    Amount = deal.Amount
});

Reduce = deals => deals.GroupBy(result => new
{
    result.ProductId,
    result.ContractLength,
    result.Amount
}).Select(result => new
{
    result.Key.ProductId,
    result.Key.ContractLength,
    Provider = result.Select(x => x.Provider).Distinct(),
    result.Key.Amount
});

I'm not sure this is the best way to handle this problem, but I'm fairly new to Raven and struggling for ideas. If we keep the index simple and group on the client side, then we can't keep paging consistent.

Any ideas?

Alex Jones
  • `Provider = result.Select(x => x.Provider).Distinct()` you can't do this. Map/Reduce is distributed and at no point can you assume you *ever* have the entire collection of Providers. The only trustworthy LINQ operators in the reduce are ones like `Count()` and `Sum()` because they are associative – Chris Marisic Dec 11 '15 at 16:24
  • I know this is distributed in platforms like Hadoop but are you sure this is actually distributed in RavenDB? – Alex Jones Dec 13 '15 at 19:25
  • yes, you cannot ever depend on having the full object set in reduce. That's not to say you won't have it, maybe even having the full set 99% of the time, but even if 1% of the time you don't you'll lead yourself into a minefield of misleading data. Running a select for `.Provider` in the reduce like that, you're ensuring your index will have data missing from it. – Chris Marisic Dec 14 '15 at 16:46
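
The concern in the comments above can be illustrated with a minimal sketch in plain LINQ, not the RavenDB API. It assumes the reduce step may run over partial batches and then again over its own output, so the rows it emits must have the same shape as the rows it consumes. The `DealRow` record, the `Reduce` helper, and the sample sellers are all hypothetical. Each map row carries a one-element provider array, and the reduce flattens and de-duplicates those arrays, which is a set union and so gives the same result regardless of how the rows are batched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape shared by the map output and the reduce output,
// so that reduce results can be fed back into reduce.
public record DealRow(string ProductId, int ContractLength, decimal Amount, string[] Providers);

public static class ReReduceSketch
{
    // Stand-in for a reduce step: it may receive raw map rows,
    // earlier reduce results, or a mix of both.
    public static List<DealRow> Reduce(IEnumerable<DealRow> rows) =>
        rows.GroupBy(r => new { r.ProductId, r.ContractLength, r.Amount })
            .Select(g => new DealRow(
                g.Key.ProductId,
                g.Key.ContractLength,
                g.Key.Amount,
                // Flatten the per-row arrays, then de-duplicate (set union).
                g.SelectMany(r => r.Providers).Distinct().OrderBy(p => p).ToArray()))
            .ToList();

    public static void Main()
    {
        // Map output: every deal contributes a one-element provider array.
        var mapped = new[]
        {
            new DealRow("p1", 12, 10m, new[] { "sellerA" }),
            new DealRow("p1", 12, 10m, new[] { "sellerB" }),
            new DealRow("p1", 12, 10m, new[] { "sellerA" }),
            new DealRow("p2", 24, 15m, new[] { "sellerC" }),
        };

        // Reduce two partial batches, then reduce the partial results again.
        var inBatches = Reduce(Reduce(mapped.Take(2)).Concat(Reduce(mapped.Skip(2))));

        foreach (var row in inBatches)
            Console.WriteLine($"{row.ProductId}: {string.Join(",", row.Providers)}");
    }
}
```

By contrast, if the map emits a single `Provider` value and the reduce emits a collection, the second pass of `Select(x => x.Provider)` would be operating on collections rather than single values, so the two shapes no longer line up.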

1 Answer


You are grouping on the document id, deal.Id, so you'll never actually generate a reduction across multiple documents. I don't think that this is intended.
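
As a rough illustration of that point, here is a small plain-LINQ sketch (not a RavenDB index) with hypothetical sample deals, where `Id` is unique per document and `ProductId` is shared between deals. Grouping on the unique key produces one group per document, so nothing is consolidated; grouping on the shared key is what merges deals.

```csharp
using System;
using System.Linq;

public static class GroupKeySketch
{
    public static void Main()
    {
        // Hypothetical deals: Id is unique per document, ProductId is shared.
        var deals = new[]
        {
            new { Id = "deals/1", ProductId = "p1", Provider = "sellerA" },
            new { Id = "deals/2", ProductId = "p1", Provider = "sellerB" },
            new { Id = "deals/3", ProductId = "p2", Provider = "sellerC" },
        };

        // One group per document: no reduction across documents.
        var groupsByDocumentId = deals.GroupBy(d => d.Id).Count();

        // Deals that share a product collapse into a single group.
        var groupsByProductId = deals.GroupBy(d => d.ProductId).Count();

        Console.WriteLine($"by Id: {groupsByDocumentId}, by ProductId: {groupsByProductId}");
    }
}
```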

Ayende Rahien
  • Apologies Ayende! This situation was created in error when I was trying to simplify the example for posting. I have updated the question to reflect the actual problem. Think of the Id as a shared Id for linked product information. – Alex Jones Dec 13 '15 at 19:29