Let's say I've got some documents in an index, and one of the fields is a URL. Something like:
{"Url": "Server1/Some/Path/A.doc"},
{"Url": "Server1/Some/OtherPath/B.doc"},
{"Url": "Server1/Some/C.doc"},
{"Url": "Server2/A.doc"},
{"Url": "Server2/Some/Path/B.doc"}
I'm trying to extract counts per path segment for my search results. This would presumably mean one query per branch.
E.g.:
Initial query:
Server1: 3
Server2: 2
Server1 Query:
Some: 3
Server1/Some Query:
Path: 1
OtherPath: 1
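To pin down the counts I'm expecting, here is a rough sketch in plain Python that computes them from the sample URLs above. The helper name `child_counts` is made up for illustration, and this only describes the desired output; it is not how I'd want to compute it in practice.

```python
from collections import Counter

# Sample URLs copied from the documents above.
URLS = [
    "Server1/Some/Path/A.doc",
    "Server1/Some/OtherPath/B.doc",
    "Server1/Some/C.doc",
    "Server2/A.doc",
    "Server2/Some/Path/B.doc",
]


def child_counts(urls, prefix=""):
    """Count documents per next-level folder under `prefix` (hypothetical helper)."""
    base = [p for p in prefix.split("/") if p]
    counts = Counter()
    for url in urls:
        folders = url.split("/")[:-1]  # drop the file name
        # Keep only URLs inside the branch that have a deeper folder level.
        if folders[:len(base)] == base and len(folders) > len(base):
            counts[folders[len(base)]] += 1
    return dict(counts)


print(child_counts(URLS))                  # initial query
print(child_counts(URLS, "Server1"))       # Server1 query
print(child_counts(URLS, "Server1/Some"))  # Server1/Some query
```

Note that a document sitting directly in a folder (like C.doc in Server1/Some) doesn't contribute to any child of that folder.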
Now I can broadly see two ways to approach this, and I'm not a great fan of either.
Option 1: Scripting. MVEL seems to be limited to mathematical operations (at least I can't find a string split in the docs), so this would have to be done in Java. That's possible, but it feels like a lot of overhead if there are a lot of records.
Option 2: Store the path parts alongside the document...
{"Url": ..., "Parts": ["1|Server1","2|Some","3|Path"]},
{"Url": ..., "Parts": ["1|Server1","2|Some","3|OtherPath"]},
{"Url": ..., "Parts": ["1|Server1","2|Some"]},
{"Url": ..., "Parts": ["1|Server2"]},
{"Url": ..., "Parts": ["1|Server2","2|Some","3|Path"]}
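The pre-processing for this would be something like the following sketch, run before indexing. The function name `url_to_parts` is just an illustration, and it assumes the last segment of the URL is always a file name that should be left out.

```python
def url_to_parts(url):
    """Turn 'Server1/Some/Path/A.doc' into depth-tagged folder parts."""
    folders = url.split("/")[:-1]  # assumption: last segment is the file name
    return [f"{depth}|{name}" for depth, name in enumerate(folders, start=1)]


doc = {"Url": "Server1/Some/Path/A.doc"}
doc["Parts"] = url_to_parts(doc["Url"])
print(doc)
```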
This way I could do something like: URLs starting with 'Server1/Some', facet on parts starting with '3|'. This feels so horribly hackish.
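For what it's worth, the request I have in mind would look roughly like the sketch below. It makes several assumptions: that a terms aggregation with an `include` regex is available (rather than the older facets), that `Url` and `Parts` are indexed as exact, unanalyzed values so the prefix query and the regex see the raw strings, and that the real search query would be combined with the prefix clause. The names `branch_query` and `children` are made up.

```python
import json


def branch_query(prefix):
    """Build a search body counting next-level folders under `prefix` (hypothetical)."""
    segments = [p for p in prefix.split("/") if p]
    depth = len(segments) + 1  # depth tag of the children we want counted
    if segments:
        # Trailing slash so 'Server1/Some' does not also match 'Server1/SomeOther'.
        query = {"prefix": {"Url": "/".join(segments) + "/"}}
    else:
        query = {"match_all": {}}
    return {
        "size": 0,  # only the counts are needed, not the hits
        "query": query,
        "aggs": {
            "children": {
                # '|' is escaped because it is an operator in the regex syntax.
                "terms": {"field": "Parts", "include": f"{depth}\\|.*"}
            }
        },
    }


print(json.dumps(branch_query("Server1/Some"), indent=2))
```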
What's a good way to do this? I can do as much pre-processing as required, but the counts need to come from ES, since what matters is the count of results from a query.