0

This question is related to my previous question :

error of finding distinct cobinations of muiltiple columns in IBM netezza SQL table

Now, I need to find some partial duplicated rows in the table in SQL IBM netteza Aiginity workbench.

The table is like :

id1   id2    **id3    id4  id5   id6**   id7  id8   id9
NY    63689  eiof   394  9761   9318   2846  2319  215
NY    63689  eiof   394  9761   9318   97614 648   645
CT    39631  pfef  92169 9418   9167   164   3494  34
CT    39631  pfef  92169 9418   9167   3649  7789  568

id3    id4  id5   id6  are duplicated for id1 = NY and id2 = 63689
id3    id4  id5   id6  are duplicated for id1 = CT and id2 = 39631

The result should be

 id1   id2    value
 NY    63689  2   
 CT    39631  2

UPDATE: I only need to count the partial duplicated for id3 id4 id5 id6 for each id1 and id2. I do not care the columns of id7, id8, id9.

I used the sql query:

 SELECT  id1,       id2,
  COUNT(*)   AS value
FROM
(
  SELECT
    id1, id2, id3, id4, id5, id6
  FROM
    myTable
  GROUP BY
    id1, id2, id3, id4, id5, id6
)
  AS uniques
GROUP BY
  id1, id2

But, I got:

   id1   id2    value
   NY    63689  number of combinations of id7  id8   id9   
   CT    39631  number of combinations of id7  id8   id9   

Any help would be appreciated.

Community
  • 1
  • 1
user3601704
  • 753
  • 1
  • 14
  • 46
  • What do you expect if you also have 2 rows of "NY 63689 eiof 111 2222 333 2846 2319 215"? I think you need a little bit more variety in your sample data to suss out what you want as a result. – ScottMcG Nov 19 '14 at 22:14

1 Answers1

3

The following query produces the output you want. Is this what you want to do?

SELECT  id1, id2, COUNT(*)   AS value
FROM myTable
GROUP BY id1, id2;

EDIT:

If you want complete duplicates (of all columns) but only to show the first two:

SELECT  id1, id2, COUNT(*) as value
FROM myTable
GROUP BY id1, id2, id3, id4, id5, id6;

You can add having count(*) > 1 if you only want examples with duplicates.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • I have updated the OP and the problem may be a little different now. – user3601704 Nov 19 '14 at 19:51
  • @Gordon linoff, the second solution get the result of all columns including id7/8/9 in my new updated OP. this is not what I want. – user3601704 Nov 20 '14 at 03:21
  • @user3601704 . . . I have no idea what you mean. ids 7, 8, and 9 are not in either query. – Gordon Linoff Nov 20 '14 at 03:35
  • @GordonLinoff, the columns of id7/8/9 should not be counted in with id3/4/5/6 columns. In the example of OP, the for NY 63689, it should return 2 as duplicate rows for id3/4/5/6. – user3601704 Nov 20 '14 at 04:08