We have documents in different languages. To be able to search across them, we created an index with one field per language, and we fill only the field that matches the language of the document (the other fields stay empty). We do not know which language a user searches in, so we search all fields to make sure the applicable field is always included.
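For reference, a minimal sketch of what such an index definition could look like, written as a Python dict in the shape of the index JSON. The field names Title_enus and Title_nlnl are the ones used in our queries further down; the index name "documents" and the key field "id" are placeholders for illustration only.

```python
# Sketch of an index with one searchable field per language.
# "documents" and "id" are placeholder names, not our real schema.
index_definition = {
    "name": "documents",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {
            "name": "Title_enus",
            "type": "Edm.String",
            "searchable": True,
            "analyzer": "en.microsoft",
        },
        {
            "name": "Title_nlnl",
            "type": "Edm.String",
            "searchable": True,
            "analyzer": "nl.microsoft",
        },
    ],
}


def analyzers_by_field(definition):
    """Map each field that declares an analyzer to that analyzer's name."""
    return {
        field["name"]: field["analyzer"]
        for field in definition["fields"]
        if "analyzer" in field
    }


def make_document(doc_id, title, language_field):
    """Build a document that fills only the field for its own language."""
    doc = {"id": doc_id}
    for name in analyzers_by_field(index_definition):
        doc[name] = title if name == language_field else None
    return doc
```

An English document would then be created with make_document("1", "A document title with the and it in the name", "Title_enus"), which leaves Title_nlnl empty.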
We have an issue when users supply a search query that contains noise/stop words. Such queries seem to remove perfectly valid search results from the result set when we use searchMode=all together with a language analyzer. To test this behavior, we have for instance the following text in our index:
A document title with the and it in the name
When we use the following search query, we get the expected search result:
search=document title name&QueryType=full&searchMode=all&$count=true
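To make the test queries easy to reproduce, here is a small sketch that builds the request URL for a query like the one above. The service name, index name and api-version value are assumptions for illustration, the parameter is written in its documented spelling queryType, and a real request would also need an api-key header, which is left out here.

```python
from urllib.parse import urlencode


def build_query_url(service, index, search, search_fields=None,
                    api_version="2020-06-30"):
    """Build a GET URL for a full-syntax query with searchMode=all.

    service, index and api_version are placeholders for illustration.
    """
    params = {
        "api-version": api_version,
        "search": search,
        "queryType": "full",
        "searchMode": "all",
        "$count": "true",
    }
    if search_fields:
        # searchFields takes a comma-separated list of field names.
        params["searchFields"] = ",".join(search_fields)
    # Keep "$" and "," readable instead of percent-encoding them.
    query = urlencode(params, safe="$,")
    return f"https://{service}.search.windows.net/indexes/{index}/docs?{query}"


if __name__ == "__main__":
    print(build_query_url("my-service", "documents", "document title name"))
```

Passing search_fields=["Title_enus", "Title_nlnl"] restricts the same query to those two fields.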
However, when we search for the exact title (or even just add a few of the noise words such as "with", "the" and "in"), the result is not returned when we use the en.microsoft analyzer. When we use another language analyzer (which uses other noise/stop words), the result is returned. We see similar behavior with the nl.microsoft analyzer on a Dutch index when we search for text that contains Dutch noise/stop words such as "bij", "in" or "en", even though these words are part of the indexed text.
Is there some way to resolve this issue? Is this a bug in the search when using language analyzers? I would assume that when a query runs against an index from which noise/stop words were filtered, Cognitive Search would also remove those noise/stop words from the query before executing it.
Note: We also found the following Stack Overflow post: "Queries with stopwords and searchMode=all return no results". It seems the issue only occurs when we search multiple fields with different languages. I can confirm this. If I test the search query against only the English field, using the following query, we get the expected result:
search=document title name&QueryType=full&searchMode=all&searchFields=Title_enus&$count=true
However, when I search two fields, one with an English and one with a Dutch analyzer, I no longer get the English result:
search=document title name&QueryType=full&searchMode=all&searchFields=Title_enus,Title_nlnl&$count=true
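One possible explanation, which I have not verified against the service, is that each query term is analyzed separately per field and that with searchMode=all a term must match in at least one field whose analyzer kept it. The toy model below only illustrates that assumption: the stop word lists are tiny made-up samples, the tokenizer is a plain whitespace split, and none of this is the actual Cognitive Search implementation.

```python
# Toy model of an ASSUMED per-field analysis with searchMode=all.
# The stop word sets are small hypothetical samples, not the real
# en.microsoft / nl.microsoft lists.
EN_STOP = {"a", "with", "the", "and", "it", "in"}
NL_STOP = {"bij", "in", "en", "de", "het"}


def analyze(text, stop_words):
    """Lowercase, split on whitespace and drop stop words."""
    return [t for t in text.lower().split() if t not in stop_words]


def matches_all(query, doc, stop_words_by_field):
    """Return True if every query term that survives analysis in at
    least one field is found in one of the fields that kept it."""
    indexed = {
        field: set(analyze(doc.get(field) or "", stop_words))
        for field, stop_words in stop_words_by_field.items()
    }
    for term in query.lower().split():
        keeping = [f for f, sw in stop_words_by_field.items() if term not in sw]
        if not keeping:
            continue  # dropped by every analyzer, so no clause remains
        if not any(term in indexed[f] for f in keeping):
            return False  # required term is missing from all fields that kept it
    return True


TITLE = "A document title with the and it in the name"
DOC = {"Title_enus": TITLE, "Title_nlnl": ""}

english_only = matches_all(TITLE, DOC, {"Title_enus": EN_STOP})
both_fields = matches_all(
    TITLE, DOC, {"Title_enus": EN_STOP, "Title_nlnl": NL_STOP}
)
```

In this model english_only is True while both_fields is False: a word like "the" is dropped for the English field but kept for the Dutch field, where the document has no text, so the term can never match.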
Our actual situation is slightly different from the one in that post, since we search multiple fields using an OR clause. I'll update this post once I have done more testing and can provide the exact test queries together with their results.