2

I have a SQL Server database table (DuplicateIds) that contains the ID's of duplicate words from another table (Words). Here's and example of the data in the table DuplicateIds:

        word_id  |  word
----------------------------------
        244      |  ape
        603      |  ape
       1873      |  ape
        372      |  banana
       3095      |  banana

...and so on. Often there are only two or three duplicates but there are cases with 10 or even more duplicates.

Now I want to use this table with duplicates to create a new table that connects the ID's of the words that are the same. I guess the new table would look something like this:

        word_id  |  connected_id
----------------------------------
        244      |    603
        244      |   1873
        603      |    244
        603      |   1873
       1873      |    244
       1873      |    603
        372      |   3095
       3095      |    372

With this table I can look up a certain word using its ID and get the ID's of all the words that are the same.

Now I wonder if I can write a (T)SQL statement that generates this new connection table for me, using the data from DuplicateIds?

tkahn
  • 1,407
  • 2
  • 21
  • 37

2 Answers2

4

This should do it:

SELECT
   di.word_id
  ,di2.word_id  connected_id
 into NewTable
 from DuplicateIds di
  inner join DuplicateIds di2
   on di2.word = di.word
    and di2.word_id <> di.word_id
Philip Kelley
  • 39,426
  • 11
  • 57
  • 92
  • Hi! Does this solution require that I make a duplicate of the table DuplicateIds first (I'm referring to the di2 table in your example)? – tkahn May 31 '12 at 13:55
  • Nevermind - I just saw that those where aliases. It seems to work great - a table with 2534 new records was created! – tkahn May 31 '12 at 14:00
0

Try this. I am not sure how not equal is done in sql server.

 INSERT INTO DuplicateIds 
 SELECT a.word_id, b.word_id  connected_id
 from Words a,Words b
 where a.word=b.word
 and a.word_id <> b.word_id
ejb_guy
  • 1,125
  • 6
  • 6