I have a problem and Google hasn't helped me much. I'm trying to figure out a way to ignore HTML while searching a Solr index in ColdFusion 9.
For example, if I search for microsoft
and my index contains Microsoft© makes Windows®
I'm prompted to search for "Microsoft© makes Windows®" instead of being shown the actual result.
As you can see below, I'm just passing the string into the criteria attribute of cfsearch, but doing this produces (what I consider to be) a "dirty" result.
<cfsearch
    collection="mycollection"
    criteria="microsoft"
    name="results"
    maxRows="100"
    suggestions="always"
    contextHighlightBegin="<strong>"
    contextHighlightEnd="</strong>"
    contextPassages="3"
/>
I've been looking at the documentation for Solr's query syntax but I don't see anything that jumps out at me on how to avoid this problem.
Should I look at providing the index a "flat" version of the text, or is there a way to avoid HTML strings such as © / ® / ™?
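To make the "flat" idea concrete, this is roughly what I have in mind: strip tags and entities before the text reaches the index. It is only a sketch. flattenHtml is a helper I made up (not a ColdFusion built-in), the datasource, table and column names are placeholders, and it assumes the stored text holds entities in their escaped form (for example &copy;) rather than the literal symbols.

```cfml
<cfscript>
// Hypothetical helper (not part of ColdFusion): reduce an HTML fragment to plain text.
function flattenHtml(required string html) {
    // Replace tags such as <strong> or </p> with a space, keeping the text between them.
    var text = REReplace(arguments.html, "<[^>]*>", " ", "all");
    // Replace named and numeric character entities with a space.
    // The doubled pound sign is CFML's escape for a single literal pound sign.
    text = REReplace(text, "&##?[A-Za-z0-9]+;", " ", "all");
    // Collapse the runs of whitespace left behind.
    return trim(REReplace(text, "\s+", " ", "all"));
}
</cfscript>

<!--- Placeholder query: datasource, table and column names are assumptions. --->
<cfquery name="docs" datasource="mydsn">
    SELECT id, title, body FROM articles
</cfquery>

<!--- Add a column holding the flattened text for each row. --->
<cfset QueryAddColumn(docs, "flatBody", "VarChar", ArrayNew(1))>
<cfloop query="docs">
    <cfset QuerySetCell(docs, "flatBody", flattenHtml(docs.body), docs.currentRow)>
</cfloop>

<!--- Index the flattened column instead of the raw HTML. --->
<cfindex
    action="update"
    collection="mycollection"
    type="custom"
    query="docs"
    key="id"
    title="title"
    body="flatBody">
```

The downside is that the index then only holds the stripped text, so I'd have to look up the original HTML by key when displaying results.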
I'm open to suggestions.
-- Brian.