6 unicode characters can represent an 'apostrophe' in documents. It can be either u0027, u2018, u2019, u201B, u0091 or u0092
Out of the six, Elasticsearch recognises three unicode characters as 'apostrophe' : u0027, u2018 and u2019.
So, I think your apostrophe must be last 3 unicode character, which Elasticsearch assumes as word boundaries. So, citrus's will be tokenize as citrus only.
Adding a char_filter in your analyzer might help you. All the six characters will be replaced by proper 'apostrophe' u0027
curl -XPUT http://localhost:9200/index_name(your_index) -d '
{
"settings": {
"analysis": {
"char_filter": {
"mycharfilter": {
"type": "mapping",
"mappings": [
"\\u0091=>\\u0027",
"\\u0092=>\\u0027",
"\\u2018=>\\u0027",
"\\u2019=>\\u0027",
"\\u201B=>\\u0027"
]
}
},
"analyzer": {
"quotes_analyzer": {
"tokenizer": "standard",
"char_filter": [ "mycharfilter" ]
}
}
}
}
}'