I am porting my search app from Solr to Whoosh and am working through the quick start. But I keep running into problems whenever I have to deal with strings:

>>> writer.add_document(iden=fil, content=F2T.file_to_text(fil_path))
ValueError: 'File Name.doc' is not unicode or sequence

and then:

>>> query = QueryParser("content", ix.schema).parse("first")
AssertionError: 'first' is not unicode

And THAT line comes straight from the quick-start tutorial! Does Whoosh require all fields to be in Unicode? Making my app Unicode-aware would be a lot of work (and it's not even worth it). As for "not unicode or sequence", my understanding is that a string is also a sequence data type.

Jesvin Jose
  • Why don't you ask on the mailing list or forum for Whoosh? – Thomas K Aug 01 '11 at 12:11
  • Hmm is that the best choice for Whoosh queries? – Jesvin Jose Aug 01 '11 at 12:16
  • Well, if you've got a question about a specific piece of software, you're more likely to get an answer by asking the people who know about it, rather than posting it on a general programming Q&A website. – Thomas K Aug 01 '11 at 16:29
  • @Thomas K. The mailing list is not a great format. It's hard to read code samples and it lacks, well, frankly everything that makes Stackoverflow so good. – seanieb Aug 17 '11 at 00:50
  • @seanieb: Undoubtedly. But a mailing list reliably gets your question to the people who know about a specific piece of software, and there are tools like pastebins to get around their limitations. SO is a great tool for asking questions, but it's no use if the people who read it don't know the answer. This question also points out that the tutorial is incorrect, which is something to tell the developers. – Thomas K Aug 17 '11 at 12:01

1 Answer

Yes, Whoosh requires strings to be Unicode. This line passes a plain (byte) string:

 query = QueryParser("content", ix.schema).parse("first")

Change that to:

query = QueryParser("content", ix.schema).parse(u"first")
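
When the strings are not literals (file names, extracted file contents), one option is to decode them before handing them to Whoosh. The following is only a minimal sketch and not part of Whoosh: `to_text` is a hypothetical helper, and UTF-8 is an assumed encoding that may not match your data.

```python
# Hypothetical helper (not a Whoosh API): coerce a value to Unicode text.
# type(u"") is `unicode` on Python 2 and `str` on Python 3.
text_type = type(u"")


def to_text(value, encoding="utf-8"):
    """Return value as Unicode text, decoding byte strings if needed.

    The default encoding is an assumption; pass the real one for your data.
    """
    if isinstance(value, text_type):
        # Already Unicode, nothing to do.
        return value
    if isinstance(value, bytes):
        # Byte string (a plain 'str' on Python 2): decode it.
        return value.decode(encoding)
    raise TypeError("expected text or bytes, got %r" % type(value))
```

With such a helper, the failing call from the question might become `writer.add_document(iden=to_text(fil), content=to_text(F2T.file_to_text(fil_path)))`, assuming `F2T.file_to_text` returns either a byte string or Unicode text.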
seanieb