I have a table posts:

CREATE TABLE posts (
  id serial primary key,
  content text
);

When a user submits a post, how can I compare it with the existing posts and find similar ones?
I'm looking for something like StackOverflow does with the "Similar Questions".

2 Answers

While Text Search is an option, it is not primarily meant for this type of search. Its typical use case is finding words in a document based on dictionaries and stemming, not comparing whole documents.

I am sure StackOverflow has put some smarts into the similarity search, as this is not a trivial matter.

You can get halfway decent results with the similarity function and operators provided by the pg_trgm module:

SELECT content, similarity(content, 'grand new title asking foo') AS sim_score
FROM   posts
WHERE  content  % 'grand new title asking foo'
ORDER  BY 2 DESC, content;

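Be sure to have a GiST index on content (using the trigram operator class gist_trgm_ops) for this; otherwise the % operator has to scan the whole table.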
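As a minimal sketch of the setup this relies on, the following installs the module and creates a trigram GiST index. The index name posts_content_trgm_idx is made up for illustration, and CREATE EXTENSION assumes PostgreSQL 9.1 or later:

```sql
-- Install the trigram module (provides similarity() and the % operator).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Trigram GiST index on the column being compared.
-- The index name is hypothetical; pick whatever fits your naming scheme.
CREATE INDEX posts_content_trgm_idx
    ON posts
    USING gist (content gist_trgm_ops);

-- The % operator matches rows whose similarity exceeds the current threshold.
-- show_limit() reports that threshold (0.3 unless it has been changed).
SELECT show_limit();
```

A GIN index with gin_trgm_ops is the alternative operator class shipped with pg_trgm; which one performs better depends on the data and the read/write mix, so it is worth testing both.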

But you'll probably have to do more. You could combine it with Text Search after identifying keywords in the new content.
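One possible shape for the Text Search part, as a sketch only: it assumes the keywords 'grand', 'title' and 'foo' have already been extracted from the new post by some other means, and that the 'english' text search configuration suits the content. Neither assumption comes from the question.

```sql
-- Rank existing posts by how well they match any of the extracted keywords.
-- The keyword list and the 'english' configuration are assumptions.
SELECT p.id,
       p.content,
       ts_rank(to_tsvector('english', p.content), q.query) AS rank
FROM   posts p,
       to_tsquery('english', 'grand | title | foo') AS q(query)
WHERE  to_tsvector('english', p.content) @@ q.query
ORDER  BY rank DESC
LIMIT  10;
```

The | operator ORs the keywords, so a post only needs to contain one of them to qualify, and ts_rank then pushes posts that match more of them toward the top. For anything beyond a small table, an index on the to_tsvector('english', content) expression would be needed to avoid recomputing it for every row.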

Erwin Brandstetter
You need to use Full Text Search in Postgres.

http://www.postgresql.org/docs/9.1/static/textsearch-intro.html

Neil McGuigan