8

I'm looking to 'flatten' my dataset in order to facilitate data mining. Each categorical column should be changed to multiple Boolean columns. I have a column with categorical values, e.g.:

 ID    col1
  1     A
  2     B
  3     A

I'm looking for a way to pivot this table, and have an aggregated function telling me whether this ID has value A or B:

Result:

 ID    col1A    col1B
  1     1        0
  2     0        1
  3     1        0

I tried using PIVOT but have no idea which aggregated function to use inside it.

Also looked for answers in SF but couldn't find any...

I'm using MS-SQL 2012.

Any help would be appreciated! Omri

EDIT:

The number of categories in col1 is unknown, therefore the solution must be dynamic. Thanks :)

Omri374
  • 2,555
  • 3
  • 26
  • 40

1 Answers1

12

try this:

select ID,
         col1A=(case when col1='A' then 1 else 0 end),
         col1B=(case when col1='B' then 1 else 0 end)
  from <table>    


IF you have one ID with both A and B and you want to have distinct ID in the output you could do

 select ID,
         col1A=max(case when col1='A' then 1 else 0 end),
         col1B=max(case when col1='B' then 1 else 0 end)
  from <table> 
  group by id

EDIT

As per your comment, If you do not know the number of options for col1, then you can go for dynamic PIVOT

DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(col1) 
                    from <table> 
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'SELECT id, ' + @cols + ' from <table> 

            pivot 
            (
                count([col1])
                for col1 in (' + @cols + ')
            ) p '
print(@query)
execute(@query)


SQL Fiddle Demo

Joe G Joseph
  • 23,518
  • 5
  • 56
  • 58
  • This option should atleast cover the possibility that one ID can have A *and* B, and therefore have a `GROUP BY`, and probably use `MAX()`. – MatBailie Aug 21 '12 at 05:57
  • Thanks! I just have to think of a way to do it dynamically, since I don't know the number of options I have for col1. Is there any way other than a cursor? – Omri374 Aug 21 '12 at 06:07
  • 1
    @Omri374: I have updated my query for dynamic pivot., plz check now – Joe G Joseph Aug 21 '12 at 06:20