0

I'm storing papers in SQL Server 2005 and am looking for a way to paste in the text of a paper and then search for potential plagiarism (copied content) in the database.

What's the best way to go about this? Is there a way to get a gauge for the extent to which something is similar to something else using full-text indexing, for several paragraphs of content?

skaffman
Caveatrob

2 Answers

1

Why not install Google Desktop and have it index only that one directory?

Then Google Desktop can do the indexing for you.

patrick
  • I'm intrigued by your answer -- should I export everything from SQL text fields to a folder? – Caveatrob Mar 27 '09 at 00:32
  • If you can export it to a text file, then Google Desktop can parse it. It seems like that would work fine. – patrick Mar 27 '09 at 14:47
  • I actually thought you had a bunch of text files in a folder that you were loading into SQL. If so, you wouldn't need to do anything at all: just point Google Desktop at that folder. – patrick Mar 27 '09 at 14:49
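
The export step discussed in these comments could be sketched roughly as follows. This is only an illustration, not something from the thread: `export_papers`, the `paper_<id>.txt` naming, and the assumption that rows arrive as `(id, text)` pairs from whatever database driver you use are all hypothetical.

```python
from pathlib import Path


def export_papers(rows, out_dir):
    """Write each (paper_id, text) pair to its own UTF-8 text file.

    `rows` is assumed to be an iterable of (id, text) tuples, for example
    the result of a SELECT over the table that stores the papers.
    Returns the list of paths written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for paper_id, text in rows:
        # Hypothetical naming scheme: one file per row, keyed by its id.
        path = out / f"paper_{paper_id}.txt"
        path.write_text(text or "", encoding="utf-8")  # NULL text becomes an empty file
        written.append(path)
    return written
```

Any desktop indexer pointed at `out_dir` would then see one plain-text file per paper.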
0

This is not really the sort of problem that full-text indexing in SQL Server is designed to solve. There's nothing built in to SQL Server that you can really use to help with this.

There are a number of specialised plagiarism detection tools, which a Google search will turn up for you. That's probably your best bet.
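
To give a feel for the kind of similarity gauge the question asks about, here is a minimal sketch that compares texts by overlapping word sequences ("shingles") and scores them with Jaccard similarity. It is not what any particular plagiarism tool does, and it runs outside SQL Server; the function names, the shingle length `k`, and the `papers` dictionary of id to text are assumptions made for illustration.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=5):
    """Jaccard similarity of the two shingle sets: 0.0 (disjoint) to 1.0 (identical)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def rank_matches(submission, papers, k=5):
    """Score a submission against every stored paper, highest score first.

    `papers` is assumed to map a paper id to its full text.
    """
    scores = [(pid, jaccard(submission, text, k)) for pid, text in papers.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

Comparing every submission against every stored paper this way is slow for a large collection, which is one reason a dedicated tool is likely the better choice.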

David M