1

I need to write query that will find all items from the column Description that have duplicates that are the same or similar.

My current query will find all values that are exactly the same, but it must include similar values; for example SQL Sql and sql.

SELECT 
    Description, COUNT(*) AS Count_Of    
FROM
    Source 
GROUP BY
    [Description]  
HAVING 
    COUNT(*) > 1   

I know how to use LIKE to search the table for all items similar to something I define, can I apply it to this problem?

Any and all help is greatly appreciated, thank you.

--Editted 3/26/13

When I say similar, I mean more than case sensitive. I am working with company names and must account for people using different names such as Monsters Inc and Monsters Incorporated.

I would also like the output to display what the Description is so that I know what companies have redundancies in the database.

I have since taken care of case sensitivity with

SELECT
    LOWER (Description), COUNT(*)AS Count_Of

RESOLVED

I have a query to find all that are exact that repeat, and I also have a query that will find all like items for an item I specify.

What I did to solve it was running the first query and storing all the repeated items in a table, and then modifying the second query so when run it would find all similar items there where for each item in the table I just created.

Thank you very much to all that helped

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
FreakyDan
  • 559
  • 1
  • 7
  • 24
  • This might help: http://stackoverflow.com/questions/3299330/sql-is-it-possible-to-group-by-according-to-like-functions-results – Kittoes0124 Mar 25 '13 at 22:03
  • 2
    It would help if you define what "similar" means. Your example (SQL, Sql, sql) could be solved with a case-insensitive search, but that might not be what you really mean. – Pondlife Mar 25 '13 at 22:08

3 Answers3

1

If you only mean that you wish to carry out a case INsensitive comparison then simply specify the appropriate case-insensitive collation as part of your GROUP BY clause.

You could, for example, use the following:

SELECT 
    Description COLLATE SQL_Latin1_General_CP1_CI_AS,
    COUNT(*) AS Count_Of    
FROM
    Source 
GROUP BY
    [Description] COLLATE SQL_Latin1_General_CP1_CI_AS
HAVING 
    COUNT(*) > 1 
Martin
  • 16,093
  • 1
  • 29
  • 48
  • Thank you for letting me know about that, my use of LOWER simply destroys the case whereas your method preserves it – FreakyDan Mar 28 '13 at 19:55
1

Depending on what "similar" means, you can find SOUNDEX useful:

http://www.techonthenet.com/oracle/functions/soundex.php

If not, what do you mean by similar ?

Daniel
  • 21,933
  • 14
  • 72
  • 101
  • SOUNDEX seems very interesting. While I didn't use it this time I will have to play with it in the future. Thank you for informing me about it – FreakyDan Mar 28 '13 at 19:51
1

You can use Group By + CASE WHEN to group on similar values, but needs some cubersome work to do, for example:


 SELECT 
    CASE WHEN DESCRIPTION LIKE '%ONE%' THEN 'LIKEONE'
         WHEN DESCRIPTION LIKE '%TWO%' THEN 'LIKETWO'
         ELSE 'LIKEOTHER'END , COUNT(*) AS Count_Of    
FROM
    Source 
GROUP BY
    CASE WHEN DESCRIPTION LIKE '%ONE%' THEN 'LIKEONE'
         WHEN DESCRIPTION LIKE '%TWO%' THEN 'LIKETWO'
         ELSE 'LIKEOTHER'END 
HAVING 
    COUNT(*) > 1 
ljh
  • 2,546
  • 1
  • 14
  • 20