
I have an index of git repositories, and in each document I store the name of the repository the file belongs to. The repo field has the format {section}/{repo} and is a TEXT field. I want to achieve a pretty simple thing: a list of all repositories in the index, i.e. all unique values of the repo field.

When I use

result = searcher.search(query.Every(), groupedby="repo")
for item in result.groups():
    print(item)

the values are printed with the field values split on the "/", so I'm effectively losing the {section} portion of the repo value.

I have added sortable=True to the repo field and reindexed everything. Now the repo value comes back in the correct format, but only one group is returned when I expect about 10. I see that the _facetmaps field in the "result" object has incorrect values: all the repos except one are missing.

Eugene Sajine

1 Answer


The current solution seems to be to make repo an ID field and use searcher.lexicon("repo") to get the list of unique values.

https://bitbucket.org/mchaput/whoosh/issue/407/searched-with-groupedby-returns-incorrect

Eugene Sajine