3

The title is horrible but that's the best I could do. What I have is something like this:

Country  Tag    Weight
-----------------------
1        1      20
1        2      30
1        3      77
2        1      10
2        2      11
2        3      100

Or, in a human readable form:

Country  Tag    Weight
-----------------------
USA      Rock   20
USA      Pop    30
USA      Metal  77
Spain    Rock   10
Spain    Pop    11
Spain    Metal  100

Using either SQL (to create a view) or Data Manipulation Tools in Knime, I need to manipulate the data into this form:

Country  Rock   Pop   Metal
----------------------------
USA      20     30    77
Spain    10     11    100

Essentially, Tag entries (unique ones) become the columns and countries (unique ones) become the row id's with the weight values sticking with their country/tag.

I've tried everything I can think of in Knime and no raw SQL query springs to mind. In Knime, I successfully created the structure of the matrix I want (Country x Tag), but I have no idea how to populate the actual Weight values, they're all question marks. My working solution is to simply output the data into a CSV file in the form I want instead of into the database. But that's klunky and annoying to keep in sync. Any ideas?

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
colithium
  • 10,269
  • 5
  • 42
  • 57

3 Answers3

3

You're looking for a pivot or cross table. I'm not much into Knime but that's the technique you want to be googling. if Knime doesn't have that function available, you could do worse than to drop that CSV data into MS Excel and pivot it.

LesterDove
  • 3,014
  • 1
  • 23
  • 24
  • 1
    Whenever I've read about pivot tables, it always heavily focused on aggregation. I didn't realize that they could solve a problem like this. The SUM of one element is simply that element... Duh – colithium Nov 03 '10 at 04:19
2

Using ANSI SQL, you'd use:

  SELECT t.country,
         MAX(CASE WHEN t.tag = 'Rock' THEN t.weight END) AS Rock,
         MAX(CASE WHEN t.tag = 'Pop' THEN t.weight END) AS Pop,
         MAX(CASE WHEN t.tag = 'Metal' THEN t.weight END) AS Metal
    FROM YOUR_TABLE t
GROUP BY t.country

LesterDove is correct - you're after a pivot query, to convert row data to columnar data.

PIVOT (and UNPIVOT) are ANSI syntax, but support lags a bit -- SQL Server 2005+, Oracle 11g+ are the only ones I'm aware of. This is the next most ANSI/widely support approach.

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
  • Unfortunately I'm using MySQL. And I'd have to generate the query automatically since there are ~250 countries and 200 tags. Luckily Knime has good pivoting support, I just didn't realize it applied to this situation. – colithium Nov 03 '10 at 04:42
  • @OMG Pnies: Can you also post the T-SQL Syntax? And I guess some people might be interested in Oracle too.. – PPC Oct 09 '12 at 19:13
  • @PPC: The above works on SQL Server, Oracle and PostgreSQL without alteration. Basically, anything that supports the `CASE` syntax. I've updated the question to have the pivot tag -- anyone can look for a combination with their DB of choice, support for the PIVOT keyword is relatively recent. – OMG Ponies Oct 09 '12 at 23:24
2

You can use the Pivoting node in KNIME to perform the trick. Simply select the Tag column as Pivot column, the Country column as Group column and in the Aggregation section the Weight column and as aggregation method sum. Bye, Tobias

Tobias
  • 21
  • 1