I have a table in SQL Server 2008 where I need alternating values (0 and 1) for one column, say column alt. Duplicates in that column always need the same value, so I was thinking of using dense_rank() over column alt and taking the result % 2.

But there are also zip codes in that table, and I need to order the data by zip code before assigning the alternating values.

So after the alternating values based on column alt have been assigned, they really need to alternate when the data is ordered by zip code (apart from the duplicates in the alt column, of course).

Currently the alt values do get alternating values, but when ordering by zip code I get runs such as 0,0,0 from the dense_rank function, and those are the problem.

I tried using a temp table, filling it with something like

select * into #txy ... order by zip

and then doing the dense_rank on that table, but I didn't get the expected result because the order of a temp table isn't guaranteed.

Any ideas are greatly appreciated!

Cheers, Stevo

Edit:

Sample Code:

CREATE TABLE [xy.TestTable](
[BaseForAlternatingValue] [char](10),
[zip] [varchar](5)
) ON [PRIMARY]
GO


INSERT INTO [xy.TestTable]
       ([BaseForAlternatingValue]
       ,[zip])
 VALUES
       ('cccccccccc','99999'),
       ('bbbbbbbbbb','22222'),
       ('aaaaaaaaaa','12345'),
       ('dddddddddd','33333'),
       ('aaaaaaaaaa','12345'),
       ('bbbbbbbbbb','22222')
GO

select (DENSE_RANK() OVER (ORDER BY BaseForAlternatingValue)) % 2 as AlternatingValue
    , BaseForAlternatingValue
    , zip
    from [xy.TestTable]
    order by zip


Result:
AlternatingValue    BaseForAlternatingValue zip
1                      aaaaaaaaaa            12345
1                      aaaaaaaaaa            12345
0                      bbbbbbbbbb            22222
0                      bbbbbbbbbb            22222
0                      dddddddddd            33333
1                      cccccccccc            99999

The problem now is that, when ordered by zip code, the following two rows both get the same alternating value (0). When ordered by zip code, the result should really alternate, but the alternating values should still be based on the column BaseForAlternatingValue.

0                      bbbbbbbbbb            22222
0                      dddddddddd            33333

The expected outcome should be:

AlternatingValue    BaseForAlternatingValue zip
1                      aaaaaaaaaa            12345
1                      aaaaaaaaaa            12345
0                      bbbbbbbbbb            22222
0                      bbbbbbbbbb            22222
1                      dddddddddd            33333
0                      cccccccccc            99999

Only the AlternatingValue of the last two rows differs from the current result: the value needs to alternate from one zip code to the next. Before, it was 0 for the third-to-last row and also 0 for the second-to-last row.

As for Mikael's question below, "And what if you add the row ('cccccccccc','12345')? What would the expected output be then?"

The expected output would then be:

AlternatingValue    BaseForAlternatingValue zip
1                      aaaaaaaaaa            12345
1                      aaaaaaaaaa            12345
0                      cccccccccc            12345
1                      bbbbbbbbbb            22222
1                      bbbbbbbbbb            22222
0                      dddddddddd            33333
0                      cccccccccc            99999

So in summary: I need alternating values for the column BaseForAlternatingValue, but this alternating should be visible when ordering by zip code. (and duplicates in BaseForAlternatingValue need the same "alternating" value)

----------------

In the end I found a simpler and relatively nice solution:

1) use a temp table with an id column, filled via insert into ... order by (the id values then reflect the order by clause)
2) find the smallest id for each BaseForAlternatingValue
3) count the distinct BaseForAlternatingValues whose smallest id is smaller than that, and base the alternating value on that count
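The three steps above could be sketched roughly as follows. This is only an illustration, not the asker's actual code: the temp table names #ordered and #first_seen, the IDENTITY column, and the tie-break on BaseForAlternatingValue within a zip code are all assumptions. The count here includes the base itself (<= instead of <), which is the "smaller than" count plus one, so that the first base gets 1 as in the expected output.

```sql
-- Step 1: copy the rows into a temp table in zip order.
-- With INSERT ... SELECT ... ORDER BY, the IDENTITY values are
-- assigned in the ORDER BY sequence, so id reflects the output order.
CREATE TABLE #ordered (
    id INT IDENTITY(1,1) PRIMARY KEY,
    BaseForAlternatingValue CHAR(10),
    zip VARCHAR(5)
);

INSERT INTO #ordered (BaseForAlternatingValue, zip)
SELECT BaseForAlternatingValue, zip
FROM [xy.TestTable]
ORDER BY zip, BaseForAlternatingValue;  -- tie-break is an assumption

-- Step 2: smallest id (first appearance) of each base value.
SELECT BaseForAlternatingValue, MIN(id) AS first_id
INTO #first_seen
FROM #ordered
GROUP BY BaseForAlternatingValue;

-- Step 3: count the bases first seen at or before this one
-- (#first_seen has one row per base) and take the parity.
SELECT (SELECT COUNT(*)
        FROM #first_seen f2
        WHERE f2.first_id <= f.first_id) % 2 AS AlternatingValue,
       o.BaseForAlternatingValue,
       o.zip
FROM #ordered o
JOIN #first_seen f
  ON f.BaseForAlternatingValue = o.BaseForAlternatingValue
ORDER BY o.id;

DROP TABLE #first_seen;
DROP TABLE #ordered;
```

Because every row of a base takes the parity of that base's first appearance, duplicates share a value even when they sit under different zip codes.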

spse
  • Please add sample data and expected output to your question. I, at least, do not understand your description. – Mikael Eriksson Jul 19 '12 at 07:58
  • @MikaelEriksson: you're not alone ..... – marc_s Jul 19 '12 at 08:01
  • Still missing what the expected output is meant to be. And what if you add the row `('cccccccccc','12345')`? What would the expected output be then? – Mikael Eriksson Jul 19 '12 at 08:30
  • Hi Mikael, I have added the expected outcome above in my original post. – spse Jul 19 '12 at 08:34
  • You will get the expected result if you use `dense_rank() over(order by zip) % 2 as AlternatingValue` but that is probably just because you have simplified your sample data a bit too much or ...? – Mikael Eriksson Jul 19 '12 at 08:40
  • "You will get the expected result if you use dense_rank() over(order by zip) % 2 as AlternatingValue but that is probably just because you have simplified your sample data a bit too much?" --> Answer: The problem is that the alternating value needs to be the same for the same values in the column BaseForAlternatingValue. And the same value in the column BaseForAlternatingValue might occur under many different zip codes. – spse Jul 19 '12 at 08:47

2 Answers

Try using ROW_NUMBER as a direct replacement for DENSE_RANK. DENSE_RANK will give multiple rows the same value where they tie for a rank - ROW_NUMBER will not.
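To see the difference on the sample table from the question, the two functions can be run side by side. A small sketch; the column aliases dense_rnk and row_num are made up here.

```sql
-- Compare how the two ranking functions treat ties on the same ordering.
SELECT BaseForAlternatingValue,
       DENSE_RANK() OVER (ORDER BY BaseForAlternatingValue) AS dense_rnk,
       ROW_NUMBER() OVER (ORDER BY BaseForAlternatingValue) AS row_num
FROM [xy.TestTable]
ORDER BY BaseForAlternatingValue, row_num;
```

With the six sample rows, dense_rnk comes out as 1, 1, 2, 2, 3, 4 while row_num runs from 1 to 6, so under ROW_NUMBER() % 2 the two 'aaaaaaaaaa' rows would get different values.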

EDIT

This is ugly but appears to produce the correct result. The first CTE determines the output order of the rows and calculates the "alternating value".
The second determines the first instance of each BaseForAlternatingValue in the output result set.
The output query returns the rows in the right order with the first "alternating value" for each BaseForAlternatingValue

;WITH cte AS
(
    -- Number the rows in output order and compute a candidate alternating value.
    SELECT BaseForAlternatingValue, zip,
           ROW_NUMBER() OVER (ORDER BY zip, BaseForAlternatingValue) AS rn,
           DENSE_RANK() OVER (ORDER BY zip, BaseForAlternatingValue) % 2 AS av
    FROM [xy.TestTable]
),
rnCTE AS
(
    -- rn2 = 1 marks the first occurrence of each BaseForAlternatingValue in output order.
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY BaseForAlternatingValue ORDER BY rn) AS rn2
    FROM cte
)
-- Every row takes the alternating value of its base's first occurrence.
SELECT rn.av AS AlternatingValue,
       r.BaseForAlternatingValue, r.zip
FROM cte r
JOIN rnCTE rn
  ON rn.BaseForAlternatingValue = r.BaseForAlternatingValue
 AND rn.rn2 = 1
ORDER BY zip, BaseForAlternatingValue
Ed Harper
  • As I said above, "Duplicates in that column always need the same value", that's why dense_rank is suitable for me. – spse Jul 19 '12 at 08:18
  • Thanks for this Ed, works great, and gives me reason to do more research into CTEs. Marked as answer! – spse Jul 19 '12 at 12:08
  • @spse - your selected solution as described in your edit behaves logically in exactly the same way as this code, except it uses a temp table instead of a CTE. – Ed Harper Jul 20 '12 at 12:48
  • That's right, Ed! I'd say I prefer a temp table from a readability standpoint (all personal preference though). One thing I haven't done though is measure the performance. But that part of this app isn't critical anyway. And your answer has triggered me to do some CTE reading... ;) – spse Jul 20 '12 at 13:33

I know this is irrelevant now, as this question has long-since been solved.

You can do this with a single cte and a join:

with mins as (
    -- first (smallest) zip under which each base value appears
    select min(zip) min_zip,
        BaseForAlternatingValue
    from [xy.TestTable]  -- bracketed as one name, matching the question's CREATE TABLE
    group by BaseForAlternatingValue
)
select dense_rank() over (order by m.min_zip, t.BaseForAlternatingValue) % 2 AlternatingValue,
    t.BaseForAlternatingValue,
    t.zip
from [xy.TestTable] t
join mins m on m.BaseForAlternatingValue = t.BaseForAlternatingValue
order by t.zip, t.BaseForAlternatingValue;

Alternate solution with a single cte and no join (a windowed min() with partition by but no order by works from SQL Server 2005 on, so it is fine on 2008):

with mins as (
    select min(zip) over (partition by BaseForAlternatingValue) min_zip,
        BaseForAlternatingValue,
        zip
    from [xy.TestTable]
)
select dense_rank() over (order by min_zip, BaseForAlternatingValue) % 2 AlternatingValue,
    BaseForAlternatingValue,
    zip
from mins
order by zip;

The idea: if a base never appeared under more than one zip, you could simply dense_rank ordered by zip first and then base. Because a base can span several zips, rank by the minimum zip of each base instead, which is the zip at which that base first appears in the output. You can get it with min() and group by, or with min() over (partition by ...) to remove the join.

Arin Taylor