0

I'm working with the SQL version created by data.world which seems pretty generic (probably a version of MySql)
my data looks like this:

ID   |SSM         | Abortion   | Climate     |
1     High          Low           Medium
2     High          High          Lo 
3     Low           High          High
4     Medium        Low           Low 

I want to generate a SQL statement that outputs counts for each column, it would hopefully look like the following:

 Priority |  SSM    | Abortion | Climate |
 High        2           2         1
 Medium      1           0         1
 Low         1           1         2
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
rwreed
  • 348
  • 2
  • 10
  • Tag your question with the database you are using. – Gordon Linoff Aug 19 '18 at 13:15
  • I couldn't find the SQL version they are using in their docs. My sense is its a version of mysql, but I wasn't sure. So I didn't tag mysql. – rwreed Aug 19 '18 at 13:24
  • . . I added MySQL based on your supposition. You can try running `select version() as mysql_version`. If it is MySQL, you will get the MySQL version number. If Postgres, it'll be obvious. I think it will fail in most other databases. – Gordon Linoff Aug 19 '18 at 13:38
  • Consider handling issues of data display in the presentation layer/application-level code, assuming you have that (e.g. a simple PHP loop acting upon an ordered array). – Strawberry Aug 19 '18 at 14:14
  • @Strawberry, you got me thinking that I had conceptualized my problem incorrectly. I needed to convert the wide table to a long one so my presentation application could more easily take it. I've added another answer below that does that. Thanks – rwreed Aug 19 '18 at 17:48

2 Answers2

2

A relatively simple way that works in any database is to use union all:

select 'High' as priority,
       sum(case when SSM = 'High' then 1 else 0 end) as SSM,
       sum(case when Abortion = 'High' then 1 else 0 end) as Abortion,
       sum(case when Climate = 'High' then 1 else 0 end) as climate
from t
union all
select 'Medium' as priority,
       sum(case when SSM = 'Medium' then 1 else 0 end) as SSM,
       sum(case when Abortion = 'Medium' then 1 else 0 end) as Abortion,
       sum(case when Climate = 'Medium' then 1 else 0 end) as climate
from t
union all
select 'Low' as priority,
       sum(case when SSM = 'Low' then 1 else 0 end) as SSM,
       sum(case when Abortion = 'Low' then 1 else 0 end) as Abortion,
       sum(case when Climate = 'Low' then 1 else 0 end) as climate
from t;

In most databases, you can combine this to:

select p.priority,
       sum(case when t.SSM = p.priority then 1 else 0 end) as SSM,
       sum(case when t.Abortion = p.priority then 1 else 0 end) as Abortion,
       sum(case when t.Climate = p.priority then 1 else 0 end) as climate
from (select 'High' as priority, 1 as ord union all
      select 'Medium' as priority, 2 as ord union all
      select 'Low' as priority, 3 as ord
     ) p cross join
     t
group by p.priority, p.ord
order by p.ord;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

I accepted @Gordon Linoffs answer since is the answer to the question. But Strawberry got me thinking that I had conceptualized the problem wrong. What I needed to do was convert a wide dataset into a long, and then do the counting. I got this from Using PIVOT to Flip Data from Wide to Tall

To do this for my example:

SELECT t.condition, t.prioirty, count(t.priority)
FROM(
SELECT "Abortion" as condition, df.abortion
  FROM df
  WHERE df.abortion!=""
UNION ALL
SELECT "SSM" as condition, df.SSM
  FROM df
  WHERE df.SSM!=""
UNION ALL
SELECT "Climate" as condition, df.climate
  FROM df
  WHERE df.climate!="" ) as t
GROUP BY t.condition, t.priority

Maybe there's an easier way to do this, but data.world doesn't support CROSS APPLY

rwreed
  • 348
  • 2
  • 10