
We recently added Lucene (2.4.1) full-text search support to our application, which runs on Jackrabbit (1.6.2). We set everything up as described in the Jackrabbit tutorial, and almost everything works. However, I noticed some strange behavior that I can't find documented anywhere, so I decided to ask about it here.

For example, I have the following text in the jcr:data property of a node (jcr:content):

The quick brown fox jumps over the lazy dog 
!@#$%^& 
travmik! 
tra!vmik

My XPath query is the following:

String query = "root/element(*,my:documentBody)"
        + "[jcr:contains(*/*/element(*),'*" + param + "*')]";

Then I try to search:

"q", "qu", "qui", "quic", "quick", "k", "ck", "ick", "uick", "quick brown fox", "quick fox", "tra", "travmik", "mik" - all found ok

"tra!vmik", "travmik!", "!@#$" - nothing

And yes, I escaped all the special characters in these search terms.

What did I do wrong?

P.S. I have one more question: the Lucene docs say that "You cannot use a * or ? symbol as the first character of a search", but I do use one and it works. Why?

travmik

1 Answer


I found the problem. It came from a misunderstanding of the text extractors that Jackrabbit uses to index content. I won't go into details, but this piece of code from one of the extractors is the cause of all my problems:

if (!Character.isLetterOrDigit(c)) {
    if (!space) {
        space = true;
        buffer.append(' ');
        continue;
    }
    continue;
}
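
To illustrate the effect, here is a minimal self-contained sketch that applies the same rule to the sample strings from the question. The class name ExtractorSketch and the method normalize are hypothetical, and the surrounding loop (appending letters and digits unchanged and resetting the space flag) is an assumption, since only the fragment above is shown; it is not the actual Jackrabbit extractor.

```java
// Hypothetical stand-in for the extractor's character loop; NOT the real Jackrabbit class.
final class ExtractorSketch {

    // Applies the rule from the fragment above: each run of characters that are
    // neither letters nor digits is collapsed into a single space.
    static String normalize(String text) {
        StringBuilder buffer = new StringBuilder();
        boolean space = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                if (!space) {
                    space = true;
                    buffer.append(' ');
                }
                continue;
            }
            // Assumption: letters and digits are kept as-is and end the current run.
            space = false;
            buffer.append(c);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String[] samples = {"travmik!", "tra!vmik", "!@#$%^&"};
        for (String s : samples) {
            System.out.println("[" + s + "] -> [" + normalize(s) + "]");
        }
    }
}
```

Under these assumptions, "travmik!" becomes "travmik ", "tra!vmik" becomes "tra vmik", and "!@#$%^&" becomes a single space. The punctuation never reaches the index, which would explain why searching for "tra", "mik" or "travmik" succeeds while any term that still contains "!" or the other symbols finds nothing.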

If anyone is interested, I can explain in greater detail.

travmik
  • I am also trying to implement Lucene full-text search, but somehow it's not working. Can you help me with this? https://stackoverflow.com/questions/56655489/lucene-index-getting-empty-result-while-query – sandy Jan 13 '20 at 17:59