0

I have a query (a view actually) the normally should define a unique mapping from key to value2.

select key, value1, value2
from tbl -- joins are left out here
where (1=1) -- left out here
order by key
;

and usually the result is something like

key       value1  value2
--------------------------
Harry     fish    blue
Sally     dog     green
Willy     dog     red

so I can map key to value2 uniquely.

But as it happens the underlying data may contain junk some day:

key       value1  value2
--------------------------
Harry     fish    blue
Harry     fish    black     -- <<< breaks uniqueness
Sally     dog     green
Willy     dog     red

I would like to add HAVING and/or GROUP BY expressions to identify the nun-unique rows easily. Because I have to do something with value2 in this case I have to aggregate it somehow -- my idea is max.

Therefore an example result would be

key       value1  value2   count_value2
----------------------------------------
Harry     fish    blue     2
Sally     dog     green    1
Willy     dog     red      1

But as always, I am completely baffled by GROUP BY and its consorts. Where do I but it? Where and where the aggregations?

towi
  • 21,587
  • 28
  • 106
  • 187
  • how do we know that Harry's black fish is junk while Harry's blue fish is ok ? Which database and version is used ? – t-clausen.dk Jun 21 '12 at 11:16

3 Answers3

1

something like this?

select 
  key, 
  min(value1) as "a value1", 
  min(value2) as "a value2", 
  max(value2) as "another value2", 
  count(distinct value2) "number of value2"
FROM tbl
group by key
having count(distinct value2) > 1
Stefan Steinegger
  • 63,782
  • 15
  • 129
  • 193
  • Right! I had to modify it a bit, because I forgot to mention a `distinct` complication, but with your hints I managed it. Basically I now have `select key, v1, min(v2), max(v2), count(*) from ( select distinct key, v1, v2 from tbl ) group by key, v1;`. The having filter I do not want to apply here, because I want to *see* a bad line. – towi Jun 21 '12 at 11:48
1

Closer to solution:

select tbl.*,count(distinct value2) over (partition by key) from tbl

Or with a group by

select key,count(distinct value2) from tbl group by key having count(distinct value2)>1
Nahuel Fouilleul
  • 18,726
  • 2
  • 31
  • 36
  • I agree mostly with this answer. – Oleg Danu Jun 21 '12 at 11:31
  • whoo, I can't tell. I went for `group by`, see the other answer, because it was closer to what I had in mind. but you are probably right and I believe you. – towi Jun 21 '12 at 11:45
0

Please try this and let me know this is what you are looking for

select C.key,C.value1,C.value2,A.count_value2 from
(select key,value1,COUNT(*) count_value2
from tbl
group by key,value1)A
join
(select B.key,B.value1,B.value2 from
(select *,ROW_NUMBER() over(partition by T.key,T.value1 order by T.key )  rn 
from tbl T 
group by T.key,T.value1,T.value2)B
where B.rn=1)C
on C.key=A.key
and C.value1=A.value1
Joe G Joseph
  • 23,518
  • 5
  • 56
  • 58
  • *gulp* partitions... joins... subselects... I think I got a simpler solution, see other answer. but since you are the second person mentioning `partition` to me I think I will look into that. – towi Jun 21 '12 at 11:56