In Azure Search, on a field with values like "12-10-3" or "30-843-44", I have set up a custom tokenizer to replace the dashes with an empty string.
I now want to do an "ends with" regex search, but I cannot get it to do quite what I want.
For example, to find codes ending in 3, I have tried:
searchMode=any&queryType=full&search=code:/(.*)3/
This returns "12-10-3", as expected, but also values such as "30-843-44".
I then tried:
searchMode=any&queryType=full&search=code:/(.*)3[^<0-9>]*/
But this seems to give the same result. I have been working through the regex syntax referenced in the Azure Search docs.
When I test my tokenizer on "123-456-78", it appears to work and produces a single token with the dashes removed, so I don't understand why the regex search does not behave as expected:
"tokens": [
    {
        "token": "12345678",
        "startOffset": 0,
        "endOffset": 10,
        "position": 0
    }
]
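To make the expectation concrete, here is a minimal Python sketch of the behaviour I am after. It is not Azure Search itself: it assumes each code is indexed as one token with the dashes removed (as in the analysis output), and that the regex has to match the whole token, which is how Lucene regular expression queries work. The helper names `normalize` and `ends_with_3` are made up for illustration.

```python
import re


def normalize(code: str) -> str:
    """Mimic the dash-removing mapping (hypothetical helper, not the SDK)."""
    return code.replace("-", "")


def ends_with_3(code: str) -> bool:
    """Return True if the whole normalized token matches '.*3'."""
    # re.fullmatch anchors the pattern at both ends of the token. This is an
    # assumption about how the regex query is applied to each indexed token.
    return re.fullmatch(r".*3", normalize(code)) is not None


codes = ["12-10-3", "30-843-44"]
matches = [c for c in codes if ends_with_3(c)]
print(matches)  # ['12-10-3']
```

Under those assumptions only "12-10-3" should come back, which is not what the search returns.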
Any ideas?
Update:
The tokenizer is applied in C# as follows:
var myIndexDefinition = new Index()
{
    Name = "MyIndex",
    Analyzers = new[]
    {
        new CustomAnalyzer
        {
            Name = "code_with_dash_analyzer",
            Tokenizer = TokenizerName.Keyword,
            CharFilters = new CharFilterName[] { "dash_to_empty_mapper" }
        }
    },
    CharFilters = new List<CharFilter>
    {
        new MappingCharFilter("dash_to_empty_mapper", new[] { "- => " })
    },
    Fields = new[]
    {
        // Field with the dash in the values
        new Field("codes", DataType.String) { IsRetrievable = true, IsSearchable = true, IsSortable = true, IsFilterable = true, IsFacetable = true },
        //.... other field definitions....
    }
};