Questions tagged [full-text-indexing]

indexing for full-text search

569 questions
1
vote
2 answers

What is a good choice for full-text indexing when developing an OS X application?

Hi, I'm implementing an IMAP client as a Mac OS X application using MacRuby. For offline availability, I want to allow full-text indexing and attribute-based indexing of all messages. Attributes include common e-mail fields like from:,…
Overbryd
  • 4,612
  • 2
  • 33
  • 33
1
vote
1 answer

Suggestions on full-text search or existing search algorithms

Can someone suggest an easy way to solve the search problem below? Is there an existing algorithm for it, or will full-text search suffice? The items are classified as follows…
user867662
  • 1,091
  • 4
  • 20
  • 45
1
vote
1 answer

MySQL Match Against: easiest way to deal with stopwords

I get zero results from this query: SELECT COUNT(1) FROM `myTable` WHERE MATCH(tagline) AGAINST(' +IT professional' IN BOOLEAN MODE) I get 92 from this: SELECT COUNT(1) FROM `myTable` WHERE `tagline` LIKE '%IT professional%' I prefer the first…
Ned Hulton
  • 477
  • 3
  • 12
  • 27
1
vote
2 answers

Neo4j Full Text Index: Term(s) found

I'm using Neo4j's db.index.fulltext.queryNodes and getting nice results using wildcards. This uses Lucene 5.5.5 but may not implement all of its functionality. Is there a way to return the specific term that was found within the text being…
David A Stumpf
  • 753
  • 5
  • 13
1
vote
0 answers

ElasticSearch tutorial: Getting ValueError from bulk_indexing

I am following this tutorial: https://medium.com/free-code-camp/elasticsearch-with-django-the-easy-way-909375bc16cb#.le6690uzj The tutorial is about using Elasticsearch with a Django app. I am stuck where it asks to run bulk_indexing() in the shell. I am…
1
vote
1 answer

JanusGraph not able to find a suitable index for an index-enabled property key

I'm working on a JanusGraph application. To improve Gremlin query performance, we are creating two mixed indexes, one for vertices and one for edges. Now JanusGraph can query indexes for property keys that are created and indexed at the time of index…
1
vote
1 answer

Substring search with CTXCAT index doesn't work even after enabling substring index during index creation

I am trying to create a text index (CTXCAT) on a column of an Oracle DB table. I have set substring search to TRUE when creating the index. But after the index is created, when I do a substring search I don't see any results. DDL of the table " …
user1851006
  • 365
  • 1
  • 4
  • 12
1
vote
0 answers

Is there any way to overcome the limits of tsvector and tsquery in PostgreSQL full-text search?

I have a PDF of more than 80 pages; I extract the text from it and convert it to a tsvector, but I get duplicate positional values when it exceeds the limit: 'querytre':10668,10719,10723,16383 'quest':16383 'question':16383…
SHIV
  • 11
  • 1
1
vote
2 answers

Find n characters after a specific string in Python

I have a webpage's source. It's just a ton of random numbers, letters, and function names, saved as a string in Python 3. I want to find the text that says \"followerCount\": in this source string, but I also want to find a little bit…
vladusatii
  • 82
  • 1
  • 8
1
vote
3 answers

Fulltext indexes vs pattern_ops indexes

I am using Django, and all of my queries are generated by Django, so I have no handwritten queries... I have a table of BillRecords, which has a field subscriberno. In my Django filters, I use a filtering query…
Mp0int
  • 18,172
  • 15
  • 83
  • 114
1
vote
1 answer

RediSearch - searching for a particular word that occurs in many records takes a long time. How can I improve it?

I have an address database (stored as hashes) with about 30 million records. I added a text index to all address fields. Searching works fine until I search for a word that occurs in many records. For example, searching for the word "London", which occurs in…
1
vote
2 answers

Arabic text files searching and indexing

I am working on an electronic library project (for Arabic books). It is a program that allows users to import their books into the system's library and search against it. The system is delivered to the user with a basic library…
1
vote
0 answers

Querying a Lucene index with arbitrarily long article text to check for all matches within the article (through Neo4j)

I'm trying to query the Lucene index I've added to a Neo4j field (it's a "name" field that isn't very long, one to ten words at most). What I do right now is take all the text in a given webpage, sanitize it with a JavaScript function to keep only…
doei
  • 77
  • 1
  • 7
1
vote
1 answer

MySQL InnoDB FULLTEXT index

I am dealing with FULLTEXT indexes in a big database (with a lot of updates and inserts every day), so MySQL is using a lot of CPU. My question: is there a way to rebuild the indexes only once per day, rather than on every update/insert? Thank you.
AmenzO
  • 409
  • 7
  • 19
1
vote
1 answer

Full-Text Search on VARBINARY - viewing the actual text content

I'm experimenting with SQL Server full-text search on binary documents such as PDF, DOC, DOCX, and RTF. SQL Server seems to be able to read the text perfectly, as I've succeeded in getting the expected search results for text searches inside the VARBINARY…
Adam
  • 1,932
  • 2
  • 32
  • 57