0

As per the title I am looking for a method to search data on an equivalence basis

Ie user searches for a value of 20" it will also search for 20 inch, 20 inches etc...

I've looked at possibly using full text search and a thesaurus but would have to build my own equivalence library

Is there any other alternatives I should be looking at? Or are there common symbol/word equivalence libraries already written?

EDIT:

I dont mean the like keyword and wild cards

if my data is
A pipe that is 20" wide
A pipe that is 20'' wide - NOTE::(this is 2 single quotes)
A pipe that is 20 cm wide
A pipe that is 20 inch wide
A pipe that is 20 inches wide

I would like to search for '20 inch' and be returned
A pipe that is 20" wide
A pipe that is 20'' wide
A pipe that is 20 inch wide
A pipe that is 20 inches wide

Andrew J S
  • 86
  • 6
  • As an aside I have just set up a full text index and a minimal thesaurus (using the inch example) but the contains full text search doesnt like special characters such as " and ' (escaping doesnt work either) – Andrew J S Jul 20 '15 at 13:43
  • Ideally these two pieces of data would be in separate columns and handled via a lookup table. What you are struggling with a symptom of less than optimal database design. If you can't fix the tables you might be able to create a bolt on table that holds unit of measure aliases. So it might hold something like "", inch, inches, inch wide, inches wide. Then you can find a match based on the users input and include any of the other similar options. – Sean Lange Jul 20 '15 at 14:20
  • I agree, but unfortunately these are free text description fields, typically a lot longer that a few words, and these equivalents aren't necessarily related purely to units of measure (in this case it was the best and most difficult example I could use due to the special characters), others might include SS meaning stainless steel (which works perfectly with the thesaurus approach in full test search) – Andrew J S Jul 20 '15 at 14:41
  • try: https://en.wikipedia.org/wiki/Elasticsearch – Andrew Bickerton Jul 20 '15 at 14:56

1 Answers1

0

just answering this in case anyone else comes across it as I finally figured it out.

I ended up using an FTS thesaurus to assign equivalence to inch inches and ", and this work wonderfully for inch and inches but would return no results when I searched for 6"

It eventually turned out the underlying issue I had was that characters such as " are treated as word breakers by full text search.

I found that custom dictionary items seems to override the languages word breakers and so introducing a file called Custom0009.lex with a few lines of " and a few other characters/terms I wanted included that had word breakers in to C:\Program Files\Microsoft SQL Server\{instance name}\MSSQL\Binn and restarting the fdhost and rebuilding the index allowed my search for

select * from tbldescriptions where FREETEXT(MainDesc,'"')

or

select * from tbldescriptions where contains(MainDesc,'FORMSOF(Thesaurus,"""")')

notice the double " on the contains one as the search term is within " already it needed to be escaped to be seen.

Andrew J S
  • 86
  • 6