How to create dynamic dummy variables in SQL

Question

I have a simple table with 2 columns: ID (integer) and Category (string), and each ID can repeat with a few categories, like so:

ID    Cat
---   ---
1     A
1     B
2     B
3     A
3     B
3     C

I want to reshape this table so that each unique Category would be a dummy variable (0/1 if ID has it):

ID    A   B   C
---   --  --  --
1     1   1   0
2     0   1   0
3     1   1   1

Now, if the set of unique categories is known (and small) this is an easy CASE WHEN statement x no. of unique categories.

My questions are:

a) What if it isn't unknown or really large? How do I create this 'CASE WHEN' effect automatically?

b) More importantly: I'm not necessarily interested in all categories (say only dummies for 'A' and 'B') but only categories which I have in a separate table called Cats, which is a simple 1 column holding my relevant categories (again, unknown how many), like:

 Cat
 ---
 A
 B

How do I create dummy variables for only the categories in this dynamic table?

Do you think all of this should really be done in other tools e.g. R?

Thanks!

(I'm using Teradata SQL with SQLA, but I think it's a general SQL question)

score 2 · Answer 1 · answered Sep 30 '14 at 14:23

2

Just use table:

table(dat)
   Cat
ID  A B C
  1 1 1 0
  2 0 1 0
  3 1 1 1

and in case you want to have the binary table for a group of Cat:

table(subset(dat,Cat %in% c('A','B')))
   Cat
ID  A B
  1 1 1
  2 0 1
  3 1 1

answered Sep 30 '14 at 14:23

agstudy

119,832
17
199
261

@RicardoSaporta, didn't you see my deleted answer? – David Arenburg Sep 30 '14 at 15:03
@agstudy Thanks, but I was asking whether this could be done in SQL. – Giora Simchoni Oct 01 '14 at 06:12

How to create dynamic dummy variables in SQL

1 Answers1