I recently started using Haystack and Whoosh for keyword search in my Django project, but when I filter a SearchQuerySet with "__contains" it returns wrong results. Here are the model and the index.

class Team(models.Model):
   name = models.CharField(max_length=NAME_MAX_LENGTH, default='')
   leader = models.CharField(max_length=NAME_MAX_LENGTH, default='')
   slogan = models.CharField(max_length=SHORT_TEXT_LENGTH, default='')
   about = models.CharField(max_length=LONGTEXT_MAX_LENGTH, default='')
   b_type = models.IntegerField(default=0)
   ...

class TeamIndex(indexes.SearchIndex, indexes.Indexable):
   text = indexes.CharField(document=True, use_template=True)

   team_name = indexes.CharField(model_attr='name')
   team_logo = indexes.CharField(model_attr='logo_path')
   team_about = indexes.CharField(model_attr='about')
   team_type = indexes.CharField(model_attr='b_type')

   def get_model(self):
       return Team

   def index_queryset(self, using=None):
       return self.get_model().objects.all()

As follows, I want to find results that contain the keywords, for example using "student" to match "The student is good.".

import operator
from functools import reduce  # reduce is not a builtin in Python 3
condition = reduce(operator.and_, (Q(content__contains=x) for x in keys))
res = SearchQuerySet().filter(condition).models(model)

But it returns nothing. So I looked at the index that Whoosh itself returns, and it gives a good result.

But when I use Haystack to filter the results, it returns wrong results.

(1) "__contains" behaves like "__exact"

>>> SearchQuerySet().filter(text='rw\n').count()
3
>>> SearchQuerySet().filter(content='rw\n').count()
3
>>> SearchQuerySet().all().filter(content__contains='w').count()
0
>>> SearchQuerySet().all().filter(text__contains='w').count()
0

(2) "__exact" returns wrong results

>>> SearchQuerySet().filter(text__contains='y\n1231').count()
3

But only one indexed document matches "y\n1231".

In addition, I tried the following, but neither worked.

  1. Using "NgramField" or "EdgeNgramField" instead of "CharField".
  2. Using "SearchQuerySet().exclude(content="XXX").filter(content__contains='w').count()".

P.S. My environment:

Python: 3.5.2
Django: 1.10.5
django-haystack: 2.6.0
whoosh: 2.7.4
jieba: 0.38

1 Answer

I think I have finally solved the problem, and I want to share my mistake in case someone else runs into it. I had been using a single character as the keyword, but Whoosh never indexes single characters such as "a", "b", or "c". I used the Whoosh API to debug this.

>>> from whoosh.index import open_dir
>>> ix = open_dir('whoosh_index')
>>> searcher = ix.searcher()
>>> list(searcher.lexicon("text"))
[b'1231', b'about', b'jack', b'rw', b'tom']
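
Before blaming Haystack, it may help to compare each keyword with the terms in the lexicon. The helper below is only a sketch: `unindexed_keywords` is a hypothetical name, and it assumes that terms are stored as lowercased UTF-8 bytes, as they appear in the output above.

```python
def unindexed_keywords(lexicon, keywords):
    """Return the keywords that do not appear as whole terms in the lexicon."""
    # Assumption: terms are lowercased UTF-8 bytes, as in the lexicon output above.
    terms = set(lexicon)
    return [k for k in keywords if k.lower().encode("utf-8") not in terms]


# Lexicon copied from the Whoosh session above.
lexicon = [b'1231', b'about', b'jack', b'rw', b'tom']

# "w" and "y" are not terms in the index; "rw" and "jack" are.
missing = unindexed_keywords(lexicon, ["w", "y", "rw", "Jack"])
print(missing)
```

Any keyword this reports as missing cannot be found by a whole-term lookup, no matter how the query is written.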

I think Whoosh needs longer keywords to work, such as "jack" or "about".
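
To illustrate why a one-character keyword finds nothing, here is a toy inverted index in plain Python. It is not how Whoosh or jieba are implemented. The minimum token length of 2, the regular expression, and the sample documents are all assumptions, chosen only to reproduce the lexicon shown above.

```python
import re

MIN_TOKEN_LENGTH = 2  # assumed cutoff for this sketch, not read from Whoosh or jieba


def tokenize(text, min_length=MIN_TOKEN_LENGTH):
    """Lowercase the text, split it into word tokens, and drop short tokens."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if len(t) >= min_length]


def build_index(documents):
    """Map each kept token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def search(index, keyword):
    """Whole-term lookup: a keyword matches only if it is itself a term."""
    return index.get(keyword.lower(), set())


# Hypothetical documents chosen to reproduce the lexicon shown above.
documents = {1: "rw\n", 2: "y\n1231", 3: "about jack", 4: "tom"}
index = build_index(documents)

print(sorted(index))        # the terms that survived tokenization
print(search(index, "w"))   # "w" is only part of the term "rw", so no match
print(search(index, "y"))   # "y" was dropped at index time, so no match
print(search(index, "rw"))  # a whole term, so document 1 matches
```

In this model the lookup compares whole terms, so a substring such as "w" never matches "rw", and a token that was dropped at index time can never be found at query time.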