By default, Whoosh treats every single-character word as a stop word and ignores it. This means all single letters and single digits are dropped.
Stop words are words which are filtered out before or after processing of natural language data (text). (ref)
You can check that StopFilter has minsize = 2 by default, which applies in addition to its predefined stop list:
class whoosh.analysis.StopFilter(
stoplist=frozenset(['and', 'is', 'it', 'an', 'as', 'at', 'have', 'in', 'yet', 'if', 'from', 'for', 'when', 'by', 'to', 'you', 'be', 'we', 'that', 'may', 'not', 'with', 'tbd', 'a', 'on', 'your', 'this', 'of', 'us', 'will', 'can', 'the', 'or', 'are']),
minsize=2,
maxsize=None,
renumber=True,
lang=None
)
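To make the effect of minsize concrete, here is a small pure-Python sketch of the rule. It is a simplified illustration, not Whoosh's actual implementation: keep_token is a hypothetical helper, and SAMPLE_STOPS is a shortened sample of the stop list shown above.

```python
# Simplified illustration of the filtering rule; not Whoosh's own code.
# SAMPLE_STOPS is a shortened, hypothetical sample of the default stop list.
SAMPLE_STOPS = frozenset(["and", "is", "it", "a", "the", "of"])


def keep_token(text, stoplist=SAMPLE_STOPS, minsize=2, maxsize=None):
    """Return True if a token would survive a stop filter with these settings."""
    if len(text) < minsize:
        return False  # too short: this is what drops single letters and digits
    if maxsize is not None and len(text) > maxsize:
        return False  # too long
    return text not in stoplist  # finally, drop explicit stop words


words = ["x", "7", "a", "the", "whoosh"]

# With minsize=2, every single-character token is dropped.
default_kept = [w for w in words if keep_token(w)]

# With minsize=1, "x" and "7" survive; "a" is still in the stop list.
minsize1_kept = [w for w in words if keep_token(w, minsize=1)]

print(default_kept)
print(minsize1_kept)
```

Note that lowering minsize alone does not bring back a word such as "a", because it is also listed explicitly in the stop list.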
So you can resolve this issue by redefining your schema and either removing the StopFilter or using it with minsize = 1:
from whoosh.analysis import StandardAnalyzer
from whoosh.fields import Schema, TEXT
# stoplist=None builds the analyzer without a StopFilter
schema = Schema(content=TEXT(analyzer=StandardAnalyzer(stoplist=None)))