
I am using a regexp character filter in Couchbase for my analyzer. The desired result is as follows:

phuong 1 -> phuong_1
phuong  12 -> phuong_12

The character filter is configured in the Couchbase Web Console as follows:

Regular expression : ([a-z])\s+(\\d)
Replacement: $1_
  • The above configuration produces the terms [phuong, 1, 12]
  • The desired result is [phuong_1, phuong_12]
  • I have adjusted this configuration many times, but it still does not work correctly
  • Can you help me with this problem?
Matthew Groves
nhtrung

1 Answer


Couchbase's Full Text Search is implemented in Go. Here's a playground illustration of how your regular expression works:

https://play.golang.org/p/Jray7DTYZam

As you can see in the illustration above, $1x is equivalent to ${1x}, not ${1}x. So your replacement needs to be updated to ${1}_.
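For readers who cannot open the playground link, here is a minimal sketch of the same behaviour using Go's standard `regexp` package. It is not the linked playground code, and it runs plain Go rather than the Couchbase character filter. The helper name `replace` is made up for illustration, and the pattern is assumed to use a single backslash (`\d`).

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern is assumed to be the intended expression: a lowercase letter,
// one or more whitespace characters, then a digit.
var pattern = regexp.MustCompile(`([a-z])\s+(\d)`)

// replace is a hypothetical helper that applies a replacement template
// to every match of pattern in s.
func replace(s, template string) string {
	return pattern.ReplaceAllString(s, template)
}

func main() {
	// "$1_" is parsed as a reference to a group named "1_", which does
	// not exist, so the whole match is replaced by an empty string.
	fmt.Printf("%q\n", replace("phuong 1", "$1_"))

	// Braces delimit the group number, and ${2} puts the digit back.
	fmt.Printf("%q\n", replace("phuong 1", "${1}_${2}"))
	fmt.Printf("%q\n", replace("phuong  12", "${1}_${2}"))
}
```

With the braced template, both inputs from the question come out as `phuong_1` and `phuong_12` in plain Go.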

That said, Couchbase currently has a limitation: variables ($1, ${2}, etc.) aren't supported in the replacement at the moment. I've created an internal ticket to extend support for this.

Abhi
  • Thank you for the feedback on my question. Based on your answer, I think the replacement should be ${1}_${2} – nhtrung Mar 17 '20 at 02:43
  • Although the results in the playground were correct, Couchbase still did not produce the correct terms such as phuong_1 and phuong_12. I tested this in Elasticsearch and it works, but not in Couchbase. Is there a problem with Couchbase? – nhtrung Mar 17 '20 at 02:57
  • Looks like you found an issue! Check out https://issues.couchbase.com/browse/MB-38312 and https://github.com/blevesearch/bleve/pull/1351 :) – Matthew Groves Mar 17 '20 at 13:42