1

I'm trying to group by a primary column and a secondary column. I want to ignore NULL in the secondary column unless it's the only value.

CREATE TABLE #tempx1 ( Id INT, [Foo] VARCHAR(10), OtherKeyId INT );
INSERT INTO #tempx1 ([Id],[Foo],[OtherKeyId]) VALUES
(1, 'A', NULL),
(2, 'B', NULL),
(3, 'B', 1),
(4, 'C', NULL),
(5, 'C', 1),
(6, 'C', 2);

I'm trying to get output like

Foo OtherKeyId
A   NULL
B   1
C   1
C   2

This question is similar, but takes the MAX of the column I want, so it ignores other non-NULL values and won't work.

I tried to work out something based on this question, but I don't quite understand what that query does and can't get my output to work

-- Doesn't include Foo='A', creates duplicates for 'B' and 'C'
WITH cte AS (
    SELECT *,
    ROW_NUMBER() OVER (PARTITION BY [Foo] ORDER BY [OtherKeyId]) rn1
    FROM #tempx1
)
SELECT c1.[Foo], c1.[OtherKeyId], c1.rn1
FROM cte c1
INNER JOIN cte c2 ON c2.[OtherKeyId] = c1.[OtherKeyId] AND c2.rn1 = c1.rn1

This is for a modern SQL Server: Microsoft SQL Server 2019

Dale K
  • 25,246
  • 15
  • 42
  • 71
BurnsBA
  • 4,347
  • 27
  • 39

3 Answers3

1

Hmmm . . . I think you want filtering:

select t.*
from #tempx1 t
where t.otherkeyid is not null or
      not exists (select 1
                  from #tempx1 t2
                  where t2.foo = t.foo and t2.otherkeyid is not null
                 );
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

You can use a GROUP BY expression with HAVING clause like below one

SELECT [Foo],[OtherKeyId]
  FROM #tempx1 t
 GROUP BY [Foo],[OtherKeyId]
HAVING SUM(CASE WHEN [OtherKeyId] IS NULL THEN 0 END) IS NULL
    OR ( SELECT COUNT(*) FROM #tempx1 WHERE [Foo] = t.[Foo] ) = 1

Demo

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55
0

My actual problem is a bit more complicated than presented here, I ended up using the idea from Barbaros Özhan solution to count the number of items. This ends up with two inner queries on the data set with two different GROUP BY. I'm able to get the results I need on my real dataset using a query like the following:

SELECT
    a.[Foo],
    b.[OtherKeyId]
FROM (
    SELECT
        [Foo],
        COUNT([OtherKeyId]) [C]
    FROM #tempx1 t
    GROUP BY [Foo]
) a
JOIN (
    SELECT
        [Foo],
        [OtherKeyId]
    FROM #tempx1 t
    GROUP BY [Foo], [OtherKeyId]
) b ON b.[Foo] = a.[Foo]
WHERE
    (b.[OtherKeyId] IS NULL AND a.[C] = 0)
    OR (b.[OtherKeyId] IS NOT NULL AND a.[C] > 0)
BurnsBA
  • 4,347
  • 27
  • 39