3

Is there a freely available list of the most common English words to remove from text when building a search index?

Dave

2 Answers

2

Wikipedia gives the 100 most frequent lemmas: http://en.wikipedia.org/wiki/Most_common_words_in_English

That might be a good start; the article also provides some good references.
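To show how such a list would be applied when indexing, here is a minimal Python sketch. The `STOP_WORDS` set and the `index_terms` function are hypothetical names made up for this example, and the set holds only a handful of very common English words; a real index would load a fuller list from whichever source you choose.

```python
import re

# Hypothetical, deliberately tiny stop-word set for illustration only.
# Replace it with a complete list from the source you settle on.
STOP_WORDS = frozenset({"the", "be", "to", "of", "and", "a", "in", "that", "have", "i"})


def index_terms(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [token for token in tokens if token not in stop_words]


print(index_terms("The quick brown fox jumped over a lazy dog in the park"))
# → ['quick', 'brown', 'fox', 'jumped', 'over', 'lazy', 'dog', 'park']
```

Lowercasing before the lookup keeps the stop list itself short, and a `frozenset` makes each membership check cheap. Remember to apply the same filtering to search queries, otherwise a query made up only of stop words will behave inconsistently with the index.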

Hans W

2

Here are the ones (plus characters) used in the SQL Server 2005 noise word list; I assume the SQL Server 2008 stopwords are similar.

And the MSDN documentation on it is here.

Hope this helps.

Jammin