Questions tagged [whoosh]

Whoosh is a fast, featureful, pure-Python library for full-text indexing, searching, and spell checking.

373 questions
0
votes
1 answer

efficient Boolean search for large index with whoosh

I have created an index with the fields (id, title, url, content) to store web page information gathered by crawling. Now I want to search that index with multi-word queries (including Boolean queries). Please suggest good and efficient searching algorithms (some…
WXy
  • 1
  • 3
0
votes
1 answer

Search inside a string with Whooshalchemy?

I'm building a text search with Python and Flask-Whooshalchemy. Currently I have something like this: Entry.query.whoosh_search('something').all(), which goes through the body and title of Entry. However, it returns true only if the content has that exact…
Demeter
  • 106
  • 1
  • 8
0
votes
1 answer

How to get tf-idf score and bm25f score of a term in a document using whoosh?

I am using Whoosh to index a dataset. I want to retrieve the tf-idf score and BM25F score given a term and a document. I have seen scoring.TFIDF() and scoring.TFIDFScorer(). In order to call the TFIDFScorer().score() method, we should pass a matcher…
0
votes
2 answers

Django Haystack Whoosh Tokenized Search

Let's say I have a record with the string 'hairdresser doing great job' in the search index. How do I make the search query 'hairdresser in Auckland' still return that record in the search results? I tried this but I feel it's not the right way to…
James Lin
  • 25,028
  • 36
  • 133
  • 233
0
votes
1 answer

How to specify which DIR is to be used by haystack and whoosh during index-building

I am trying to index my MySQL data into Whoosh using Haystack, but my root partition is almost full. Is there any way to specify which directory Haystack and Whoosh use while indexing data? As it uses the /tmp/ directory during this process, how can I…
Vaibhav Jain
  • 5,287
  • 10
  • 54
  • 114
0
votes
0 answers

How to index Django models into whoosh

I am trying to index my Django models into Whoosh. In this tutorial they simply index text using a content field, but how do I index my Django model in such a way ... My models.py: import json from django.db import models from django.contrib…
Vaibhav Jain
  • 5,287
  • 10
  • 54
  • 114
0
votes
1 answer

Build a whoosh scoring implementation that looks at how 'near' different terms are

I need my score to take into account only how close the terms (in a multi-term search) are. It looks like when implementing your own weighting function (the docs), you only get access to one term of the search at a time, so you cannot look at distance…
maged
  • 859
  • 10
  • 24
0
votes
0 answers

Why can I not search for a string I got via mylistOfStrings[0] when I add it to a whoosh index?

My wrapper for Python's dict, which uses Whoosh to keep a text index of key-value pairs on disk instead of keeping them in memory, does not seem to be able to retrieve values added this way: myDictObject[myStringList[0]] =…
hashim
  • 1
  • 1
0
votes
0 answers

Whoosh ValueError: Keys must increase

I'm getting the error ValueError: Keys must increase: '\x00\x00\x0b\xc4\x00\x01' .. '\x00\x00\x0b\xc4\x00\x01' at the line writer.commit(), where writer is an index.writer. I am adding 10K files to the index, and for performance, I commit every…
maged
  • 859
  • 10
  • 24
0
votes
1 answer

Getting started with Whoosh on Python

I am a complete newbie to Python as well as Whoosh. I need to create a search engine that allows me to search inside an XML file. For that, I downloaded Whoosh and ran setup.py build and setup.py install from the command prompt. I then took a…
viggie
  • 183
  • 1
  • 3
  • 11
0
votes
1 answer

searching different models in haystack whoosh

I am using Haystack and Whoosh, but it only displays the results of one model. I made the index classes for the 3 models I need, and I can choose the model I want to search in my templates, but only one model returns results; the other models…
0
votes
1 answer

Include slashes and parentheses in tokens

Background: I have search indexes containing Greek characters. Many people don't know how to type Greek, so they enter something called "beta-code". Beta-code can be converted into Greek. For example, the beta-code "NO/MOU" would be converted to "νόμου".…
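To illustrate the tokenization problem in this question, here is a minimal sketch using only Python's standard re module. It contrasts a plain word pattern, which splits "NO/MOU" at the slash, with a pattern that keeps slashes and parentheses inside a token. The character set in BETA_TOKEN, the function name tokenize_beta, and the sample input "A)BG" are assumptions made for illustration, not something taken from the question. A pattern like this could presumably be passed to a regex-based tokenizer such as Whoosh's RegexTokenizer, but that wiring is not shown or verified here.

```python
import re

# Assumed token alphabet: ASCII letters plus a few marks used in beta-code
# ( / \ = ( ) | + * ). This set is illustrative, not a complete definition.
BETA_TOKEN = re.compile(r"[A-Za-z*/\\=()|+]+")


def tokenize_beta(text):
    """Split text into tokens, keeping beta-code marks inside each token."""
    return BETA_TOKEN.findall(text)


# A plain word pattern breaks the token at the slash.
print(re.findall(r"\w+", "NO/MOU"))

# The wider character class keeps the slash and parenthesis in the token.
print(tokenize_beta("NO/MOU A)BG"))
```

Whitespace still separates tokens, so only the marks listed in the character class are treated as part of a word.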
0
votes
1 answer

How to write correct query for Haystack + Whoosh in Django

I have a model: class Article(models.Model): title = models.CharField(max_length=250) slug = models.SlugField(max_length=250) text = models.TextField() date = models.DateTimeField(auto_now_add=True) And the file search_indexes.py: from…
kusha
  • 1
  • 1
0
votes
3 answers

Django Haystack similarity search

I'm a Django newbie building a simple website. I installed Haystack with Whoosh as its search engine because it was the simplest thing to do. It works fine, but there is a problem and I don't know how to Google it. I have some categories on my site and…
darxsys
  • 1,560
  • 4
  • 20
  • 34
0
votes
1 answer

Whoosh - performance issues with wildcard searches (*something)

I'm noticing that searches like *something consume huge amounts of CPU. I'm using Whoosh 2.4.1. I suppose this is because I don't have indexes covering this search case. something* works fine. *something doesn't. How do you deal with these queries?…
Giuc
  • 125
  • 1
  • 8
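The asker's guess about why something* is cheap and *something is not can be sketched in plain Python. The code below is a general illustration over a sorted list of terms, not a description of Whoosh's internals. A prefix query can jump to its first candidate with a binary search, while a suffix query has no such shortcut unless a second list of reversed terms is kept. The reversed-term list is a common general workaround that is not mentioned in the question, and the helper names and sample terms are hypothetical.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Return terms starting with prefix, using binary search on a sorted list."""
    start = bisect_left(sorted_terms, prefix)
    hits = []
    for term in sorted_terms[start:]:
        # Terms sharing a prefix are contiguous in sorted order.
        if not term.startswith(prefix):
            break
        hits.append(term)
    return hits


def suffix_matches(sorted_reversed_terms, suffix):
    """Answer a *suffix query as a prefix lookup over reversed terms."""
    hits = prefix_matches(sorted_reversed_terms, suffix[::-1])
    return sorted(term[::-1] for term in hits)


# Hypothetical vocabulary for illustration.
terms = ["anything", "nothing", "some", "something", "thingy"]
sorted_terms = sorted(terms)
sorted_reversed_terms = sorted(term[::-1] for term in terms)

print(prefix_matches(sorted_terms, "some"))
print(suffix_matches(sorted_reversed_terms, "thing"))
```

The trade-off is extra storage for the reversed terms, in exchange for avoiding a scan of every term on each leading-wildcard query.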