
On sites like Quora or Stack Overflow, each question page shows a list of possibly related questions. A quick look suggests that they are simply looking for questions with similar text content. Is there a standard technique for finding such similar texts in a DB table where all the texts are stored?

For example, the question "How to remove Application icon from Action Bar in Android?" shows the following question as related: "Remove application icon and title from Honeycomb action bar".

If I have a column questionText in a table questions, where the question texts are stored, how can I find such related strings?

jaibatrik

1 Answer


You'll need to extract keyword tokens from your documents and then find correlations between them. You can use a FOSS tool like Apache Lucene to do that for you.
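
To make the keyword-token idea concrete, here is a minimal sketch in plain Python of one common way to do it: TF-IDF weighting plus cosine similarity. This is not what Lucene or Stack Overflow actually runs; the stop-word list, the function names, and the smoothed IDF formula are all assumptions chosen for illustration, and the questions are passed in as a list rather than read from the questions table.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "from", "how", "in", "the", "to"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one {token: weight} dict per document, using a smoothed TF-IDF."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {token: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def related(query, questions, top_n=3):
    """Rank stored question texts by similarity to the query, dropping zero scores."""
    vectors = tfidf_vectors(questions + [query])
    q_vec = vectors[-1]
    scored = [(cosine(q_vec, v), text) for v, text in zip(vectors, questions)]
    scored = [s for s in scored if s[0] > 0]
    return sorted(scored, reverse=True)[:top_n]
```

Calling related() with the Android question from the post and a list of stored question texts returns (score, text) pairs, best match first. For a large table you would not rescan every row on each request; that is the part an inverted index such as Lucene's takes care of.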

Those sites can also rely on tags to find related questions.
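
As a rough illustration of the tag idea, one simple option is the Jaccard overlap between two questions' tag sets. This is only an assumed scoring rule, not a description of how any particular site ranks related questions, and the tag names in the usage line are made up.

```python
def jaccard(tags_a, tags_b):
    """Share of tags two questions have in common: |A & B| / |A | B|."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical tag lists for two questions.
score = jaccard(["android", "actionbar"], ["android", "actionbar", "honeycomb"])
```

A score like this could be combined with a text-similarity score, for example by only comparing question texts that share at least one tag.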

lnrdo