Questions tagged [pylucene]

PyLucene is a Python extension for accessing Java Lucene. Its goal is to allow you to use Lucene's text indexing and searching capabilities from Python.

PyLucene is not a Lucene port but a Python wrapper around Java Lucene. PyLucene embeds a Java VM with Lucene into a Python process.

128 questions
0
votes
1 answer

Install Pylucene on Windows, "Make" command does not work

I was trying to set up PyLucene on Windows 10. The first few steps were successful, but after I made edits in my…
David
  • 63
  • 5
0
votes
1 answer

Trouble with Make and Make install for PyLucene

I'm trying to install PyLucene 8.1.1 on OSX 10.13.6, Python 2.7, Java 1.6. My makefile is as follows: VERSION=8.1.1 LUCENE_VER=8.1.1 PYLUCENE:=$(shell…
Egg
  • 9
  • 1
0
votes
2 answers

Pylucene installation on MacOs with adoptopenjdk / Java8

I am trying to install JCC (as part of the installation of PyLucene) and I encountered several issues with it. The python version I use is 3.7, and I have installed adoptopenjdk-8.jdk using brew cask (since Java-8 is no longer available without…
HDolev
  • 193
  • 1
  • 9
0
votes
2 answers

Using PyLucene as a K-NN Classifier

I have a dataset composed of millions of examples, where each example contains 128 continuous-value features classified with a name. I'm trying to find a large robust database/index to use as a KNN classifier for high-dimensional data. I…
Cerin
  • 60,957
  • 96
  • 316
  • 522
0
votes
1 answer

JCC installation: Java JDK dictionary does not exist

I have installed the source code for PyLucene which contains the JCC source code. When trying to run python setup.py build in the JCC directory I receive the following error: Java JDK directory 'c:/Program Files/Java/jdk1.6.0_18' does not…
guruman
  • 25
  • 6
0
votes
1 answer

How can one retrieve a particular field from all the indexed documents using PyLucene?

In Java it could be done using "MatchAllDocsQuery()", but there is no PyLucene documentation that mentions how it could be done. This is the Python code to post individual queries and then extract all the fields from the retrieved documents.…
PinkBanter
  • 1,686
  • 5
  • 17
  • 38
0
votes
1 answer

How to create CustomSimilarity Class using pylucene?

In Java, a custom similarity scoring function is created by extending the SimilarityBase Class and overriding the scoring method. However, I cannot find a way to do the same using pylucene. I have tried extending the SimilarityBase class the same…
0
votes
0 answers

Error installing Pylucene on Windows

I'm having trouble installing Pylucene on my Windows 10 machine. I have edited the makefile as recommended: PREFIX_PYTHON=C:\\Users\\xxxxx\\Programs\\Python\\Python36 ANT=C:\\apache-ant-1.10.1-bin\\apache-ant-1.10.1\\bin\\ant JAVA_HOME=C:\\Program…
user2274879
  • 349
  • 1
  • 5
  • 16
0
votes
1 answer

Issue with installing PyLucene 6.5.0 on Linux

I recently moved to python3, so I'm trying to install the recent version of Pylucene (version 6.5.0) which is compatible with python3. jcc3/sources/jcc.cpp: In function ‘PyObject* t_jccenv_strhash(PyObject*, PyObject*)’: jcc3/sources/jcc.cpp:214:27:…
amin
  • 445
  • 1
  • 4
  • 14
0
votes
1 answer

Configuring Lucene Index writer, controlling the segment formation (setRAMBufferSizeMB)

How should the parameter setRAMBufferSizeMB be set? Does it depend on the RAM size of the machine, on the size of the data that needs to be indexed, or on some other parameter? Could someone please suggest an approach for deciding the value of setRAMBufferSizeMB?
N.Dinesh.Reddy
  • 522
  • 2
  • 7
  • 15
0
votes
1 answer

What does optimize method do? Alternatives for optimize method in latest versions of lucene

I am pretty new to Lucene and am trying to understand the segment merging process. I came across the method optimize (which merges all the Lucene index segments available at that instant). My exact question is: does optimize merge all the levels…
N.Dinesh.Reddy
  • 522
  • 2
  • 7
  • 15
0
votes
0 answers

Proper way of organizing indexWriter and indexSearcher in lucene

I'm using pylucene to build and search through an inverted text index. I built this class (don't be afraid of the python code, pylucene exposes the same functions as in java): import os, re, sys, lucene from java.nio.file import Paths from…
user3091275
  • 1,013
  • 2
  • 11
  • 27
0
votes
1 answer

Lucene does not index some terms in documents

I have been trying to use Lucene to index our code database. Unfortunately, some terms get omitted from the index. E.g. in the below string, I can search on anything other than "version-number": version-number "cAELimpts.spl SCOPE-PAY:10.1.10…
n.jmurov
  • 123
  • 2
  • 12
0
votes
1 answer

How to avoid attachCurrentThread exception when using pylucene in flask?

I built a simple wrapper service around a class that reads and queries a Lucene index using pylucene (6.5). I get the following error when running the server: "RuntimeError: attachCurrentThread() must be called first". I assume that the problem stems…
Aleksandar Savkov
  • 2,894
  • 3
  • 24
  • 30
0
votes
1 answer

pyLucene - How to use BM25 similarity instead of tf-idf

As I understand it, pyLucene now also offers BM25 similarity. I am using pyLucene 4.10.1 but can't find any example of how to use BM25 instead of tf-idf. Please advise.
Dreams
  • 5,854
  • 9
  • 48
  • 71