
I need to input "face" and get "facial, faces, faced, facing, facer, faceable" etc.

I've only come across programs that do the opposite, such as Snowball and a couple of Porter stemming PHP scripts, which don't seem to work.

I'm beginning to think I may have to write this script myself, but I thought I'd check whether somebody has already been there and done that.

user734063
  • The best way is to use a dictionary. – Ibu May 07 '11 at 23:02
  • A Porter stemmer algorithm would help you reduce complex words to their base stem (e.g. "facial" to "face"), but I don't know about branching out to derivatives based on that. You would need quite a large database of words, I think. – Will Martin May 07 '11 at 23:05
  • possible duplicate of ["Opposite" of Porter Stemmer algorithm?](http://stackoverflow.com/questions/5207116/opposite-of-porter-stemmer-algorithm) – Lightness Races in Orbit May 07 '11 at 23:07
  • There is an exact duplicate of this question at the top of the "Related" questions list just to the right. This would also have appeared whilst you wrote your question. – Lightness Races in Orbit May 07 '11 at 23:08
  • So, apparently it doesn't exist! Well, time to get to work. The reverse can be accomplished with a vast dictionary database and an extensive library of lexicographical rules, prefixes, endings etc. – user734063 May 07 '11 at 23:08
  • Ah, I see. That question didn't turn up in search results. Well, I'll get back when I've found a solution; the other question does not contain one. – user734063 May 07 '11 at 23:10
  • No, use a web service rather than trying to create that database. – Ibu May 07 '11 at 23:10
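
The idea raised in the comments above (a stemmer plus a large word list) can be sketched as an inverted index: stem every word in the list once, group the words by stem, then look up the stem of the input. This is only a minimal sketch under stated assumptions. `crude_stem`, `STRIP_SUFFIXES`, and the `WORDS` list are hypothetical stand-ins; in particular, `crude_stem` is not the Porter algorithm, and a real stemmer and a full dictionary would replace them.

```python
from collections import defaultdict

# Hypothetical, deliberately crude suffix stripper standing in for a real
# stemmer. This is NOT the Porter algorithm.
STRIP_SUFFIXES = ("ing", "ial", "ed", "er", "es", "s", "e")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in STRIP_SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(words):
    """Map each stem to the set of dictionary words that reduce to it."""
    index = defaultdict(set)
    for w in words:
        index[crude_stem(w)].add(w)
    return index


def related_forms(word, index):
    """Return the other words that share the input word's stem."""
    return sorted(index.get(crude_stem(word), set()) - {word})


# Tiny stand-in word list (an assumption; a real list would be much larger).
WORDS = ["face", "faces", "faced", "facing", "facial", "fact", "walk", "walked"]
INDEX = build_index(WORDS)

print(related_forms("face", INDEX))
# → ['faced', 'faces', 'facial', 'facing']
```

The quality of the result depends entirely on the stemmer and the word list; a crude stemmer like this one will both miss forms and group unrelated words.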

1 Answer


It will be very hard to find an algorithm that produces all the different forms a word can take like that.

You can use a dictionary web service instead, which already has all the words available.

Ibu
  • Cool, I don't quite understand the mechanics of the dictionary web service, but it looks promising. – user734063 May 08 '11 at 17:19
  • Sadly, none of the dictionary web services seem to have "derived forms" as a function. The closest I've found is a regular website, `http://www.wordwebonline.com`, which lists 'derived forms' at the bottom, but it doesn't include adverbs or other forms. – user734063 May 08 '11 at 17:29
  • I've found one solution: you can place your word list in Excel (in my case, 14,000 words). Sort your list using the filter options (contains, does not contain, etc.). Separate words into columns, and find/replace for "es", "s", "ed", etc. Then write a script to check all words against a large dictionary and keep only the ones which exist. – user734063 May 09 '11 at 18:17
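
The last step described in the comment above (generate candidate forms, then keep only the ones found in a large dictionary) could look roughly like the sketch below. It is an illustration rather than the commenter's actual script: the `SUFFIXES` list, the silent-"e" rule, and the tiny `DICTIONARY` set are all assumptions made for the example, and a real run would load a full word list instead.

```python
# Hypothetical suffix list; a real one would be far longer.
SUFFIXES = ["s", "es", "ed", "ing", "er", "able", "ial"]


def candidate_forms(word, suffixes=SUFFIXES):
    """Generate possible derived forms by attaching each suffix."""
    candidates = set()
    for suffix in suffixes:
        candidates.add(word + suffix)
        # Also try dropping a trailing "e" before a vowel-initial suffix
        # (face -> facing). This is an assumed spelling rule, not a complete one.
        if word.endswith("e") and suffix[0] in "aeiou":
            candidates.add(word[:-1] + suffix)
    return candidates


def derived_forms(word, dictionary, suffixes=SUFFIXES):
    """Keep only the candidates that actually exist in the dictionary."""
    return sorted(candidate_forms(word, suffixes) & dictionary)


# Tiny stand-in dictionary (an assumption; use a large word list in practice).
DICTIONARY = {"face", "faces", "faced", "facing", "facer", "facial", "fact"}

print(derived_forms("face", DICTIONARY))
# → ['faced', 'facer', 'faces', 'facial', 'facing']
```

Over-generating and filtering keeps the rules simple: nonsense candidates such as "faceing" are produced but discarded because they are not in the dictionary.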