8

I have a question about how to pivot the table in MySQL. I have a dataset, columns like this:

ID   Name     job_title
1    Sam       Fireman
2    Tomas     Driver
3    Peter     Fireman
4    Lisa      Analyst
5    Marcus    Postman
6    Stephan   Analyst
7    Mary      Research Manager
8    Albert    Analyst
9    Chen      Driver
...etc...

And I want to generate a table like this:

Fireman  Driver   Analyst  Postman   Research Manager ...
Sam     Tomas     Lisa     Marcus     Mary
Peter   Chen      Stephan  (someone)  (someone)...
....etc...

Since, this is just a sample from the datasets, so I may not know how much different job titles in the dataset. The goal is to list every person in the different job title columns.

Is there any methods to do that? Or is it possible to generate such table in MySQL? An engineer told me that it can done by creating a view, but I do not know how. I read some books, and still confused.

Any ideas and SQL queries guides are welcome!

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
Xiaoxi Chen
  • 133
  • 1
  • 2
  • 8
  • 1
    The desired output data you posted is *not* an example of pivoted data because values in the same row have no relationship with each other (e.g. Peter has nothing to do with Chen). – Dai Feb 19 '18 at 00:12
  • @Dai Thanks for your comments. I agree with you, but how can I generate the table which I listed on the post. Just for discussion. Can we do that by using create a view statement? I really doubt that. – Xiaoxi Chen Feb 19 '18 at 00:30
  • 1
    It is possible if you know the values of all the different `job_title` values first, otherwise you will have to use Dynamic-SQL, which you cannot use in a `VIEW`. – Dai Feb 19 '18 at 00:32
  • Compacting rows is possible if your database supports `ROW_NUMBER()` and `FULL OUTER JOIN`, however MySQL currently supports neither, which makes it much more difficult. The unreleated MySQL 8 does add `ROW_NUMBER` though. – Dai Feb 19 '18 at 00:40
  • So you are asking about making each table column like `job_title` and `Name` to each be a row of values instead, correct? You want the output to be all values in a specific column to be transformed into a row, correct? Where you say "*And I want to generate a table like this:*" it appears you have row 1 example being values of the column named `job_title`, right? – Bitcoin Murderous Maniac Feb 19 '18 at 02:25
  • @ Bitcoin Murderous Maniac Yeah, you got what I mean. – Xiaoxi Chen Feb 19 '18 at 05:31

6 Answers6

3

There are 3 things to think about 1) How to dynamically generate a bunch of max(case when 2) assigning something to group the case when's by - in this case I generate a row number using a variable 3) some of your job titles contain white space which I remove for the generation of column headers

set @sql = 
            (select concat('select ', gc,            ' from 
             (select name,job_title,
                if (job_title <> @p, @rn:=1 ,@rn:=@rn+1) rn,
                @p:=job_title p
                from t
                cross join (select @rn:=0,@p:=null) r
                order by job_title
              ) s group by rn;') from
            (select 
                group_concat('max(case when job_title = ', char(39),job_title ,char(39),' then name else char(32) end ) as ',replace(job_title,char(32),'')) gc
                from
                (
                select distinct job_title from t
                ) s
                ) t
             )
;           

Generates this sql code

select max(case when job_title = 'Fireman' then name else char(32) end ) as Fireman,
        max(case when job_title = 'Driver' then name else char(32) end ) as Driver,
        max(case when job_title = 'Analyst' then name else char(32) end ) as Analyst,
        max(case when job_title = 'Postman' then name else char(32) end ) as Postman,
        max(case when job_title = 'Research Manager' then name else char(32) end ) as ResearchManager
         from 
             (select name,job_title,
                if (job_title <> @p, @rn:=1 ,@rn:=@rn+1) rn,
                @p:=job_title p
                from t
                cross join (select @rn:=0,@p:=null) r
                order by job_title
              ) s group by rn;

Which can be submitted to dynamic sql

prepare sqlstmt from @sql;
execute sqlstmt;
deallocate prepare sqlstmt;

result

+---------+--------+---------+---------+-----------------+
| Fireman | Driver | Analyst | Postman | ResearchManager |
+---------+--------+---------+---------+-----------------+
| Sam     | Tomas  | Lisa    | Marcus  | Mary            |
| Peter   | Chen   | Stephan |         |                 |
|         |        | Albert  |         |                 |
+---------+--------+---------+---------+-----------------+
3 rows in set (0.00 sec)
P.Salmon
  • 17,104
  • 2
  • 12
  • 19
1

I also encountered this problem on HackerRank. While I think the group_concat answer is very good and is typically used in these sorts of pivot situations with earlier versions of MySql, I find that concat and group_concat can be difficult to read and understand.

If your version of MySql supports window functions then you can solve this using temporary tables, as MySql does not support outer joins. You'll need a separate temp table for every pivot column to avoid Window function is not allowed in window specification errors:

use test;
drop table if exists occupations;
create table if not exists occupations  (
    name varchar(50)
    ,occupation varchar(50)
);
insert into occupations (name, occupation) select 'Samantha', 'Doctor'
    union all select 'Julia', 'Actor'
    union all select 'Maria', 'Actor'
    union all select 'Meera', 'Singer'
    union all select 'Ashley', 'Professor'
    union all select 'Kelly', 'Professor'
    union all select 'Christeen', 'Professor'
;
-- the way to approach this in mysql is to create a temp table with ordinals.
-- then upsert with four queries using row_number()
-- nb full join not supported. let's try temp table
drop table if exists doctors;
create temporary table doctors
(
    name varchar(50)
    ,occupation varchar(50)
    ,ordinal int
);
insert into doctors
    select 
        name
        ,occupation
        ,row_number() over (partition by occupation order by name) as ordinal
    from occupations
    where occupation = 'Doctor'
;
drop table if exists actors;
create temporary table actors
(
    name varchar(50)
    ,occupation varchar(50)
    ,ordinal int
);
insert into actors
    select 
        name
        ,occupation
        ,row_number() over (partition by occupation order by name) as ordinal
    from occupations
    where occupation = 'Actor'
;
drop table if exists professors;
create temporary table professors
(
    name varchar(50)
    ,occupation varchar(50)
    ,ordinal int
);
insert into professors
    select 
        name
        ,occupation
        ,row_number() over (partition by occupation order by name) as ordinal
    from occupations
    where occupation = 'Professor'
;
drop table if exists singers;
create temporary table singers
(
    name varchar(50)
    ,occupation varchar(50)
    ,ordinal int
);
insert into singers
    select 
        name
        ,occupation
        ,row_number() over (partition by occupation order by name) as ordinal
    from occupations
    where occupation = 'Singer'
;

-- upsert: update if not exists
drop table if exists results;
create temporary table results
(
    singer varchar(50)
    ,actor varchar(50)
    ,doctor varchar(50)
    ,professor varchar(50)
    ,ordinal int primary key
);
insert into results (singer, ordinal) 
    select name, ordinal from singers
on duplicate key update singer = name
;
insert into results (actor, ordinal) 
    select name, ordinal from actors
on duplicate key update actor = name
;
insert into results (doctor, ordinal) 
    select name, ordinal from doctors
on duplicate key update doctor = name
;
insert into results (professor, ordinal) 
    select name, ordinal from professors
on duplicate key update professor = name
;
select singer, actor, doctor, professor from results;

Ps. I have to disagree with earlier comments: this is a pivot. We are projecting rows into columns, with the rows being a projection of occupations and ordinals.

j_d_b
  • 638
  • 5
  • 11
-1

The desired output data you posted is not an example of pivoted data because values in the same row have no relationship with each other, it sounds like you just want a compact representation of everyone on a per-cell basis. This makes it a view-level concern and should not be performed in SQL, but should be performed in your view-level (presumably a PHP web-page, as you're using MySQL).

Your output data is column-oriented, not row-oriented, but HTML tables (and most datagrid components for other platforms like WinForms, Java and WPF) are row-oriented, so you'll need to think about how to do it.

Assuming you're targeting HTML and taking into account the row-oriented vs. column-oriented conversion required, try this (pseudo-code)

define type DBResultRow {
    id: int,
    name: string,
    job_title: string
}

let rows : List<DBResultRow> = // get rows from your view, no changes to SQL required

let jobTitles : List<String>
let jobTitleMap : Map<String,Int32>
let outputTable : List<List<String>>

foreach( person: DBResultRow in rows )
{
    let columnIdx = jobTitleMap[ person.job_title ];
    if( !columnIdx )
    {
        jobTitles.Add( person.job_title );
        columnIdx = jobTitles.Count - 1;
        jobTitleMap[ person.job_title, columnIdx ];
    }

    outputTable[ columnIdx ].Add( person.name );
}

let longestColumnLength = outputTable.Select( col => col.Count ).Max();

Then render to HTML:

<table>
    <thead>
        <tr>
foreach( jobTitle: String in jobTitles )
{
            <th><%= jobTitle #></th>
}
        </tr>
    </thead>
    <tbody>
for( row = 0; row < longestColumnLength; row++ )
{
        <tr>
    for( col = 0; col < jobTitles.Count; col++ )
    {
        if( row > outputTable[ col ].Count )
        {
            <td></td>
        }
        else
        {
            <td><%= outputTable[ col ][ row ] %></td>
        }
    }
        </tr>
}
    </tbody>
</table>
Dai
  • 141,631
  • 28
  • 261
  • 374
  • Well values in the same row *could* reflect the occurrence of a given result within the set. Anyway, I agree that the best solution would be to employ some kind of application level code – Strawberry Feb 19 '18 at 01:15
-1

Look at JSON services (JSON_OBJECTAGG,JSON_OBJECT), it can be parsed in java with basin object mapping (Jackson).

select xyz, JSON_OBJECTAGG( a, b) as pivit_point
from ... group by xyz;
B--rian
  • 5,578
  • 10
  • 38
  • 89
-1

It's better to start from the result and try to map to the original table. Basically each row of the result table should be in the same group in the original table. And the CTE table grouping and rank window function creates the group order by name.

with grouping as (
select
Name,
job_title,
rank() over (partition by job_title order by name) as rnk
from jobs
)

select
group_concat(if(g.job_title = 'Fireman', g.Name, NULL)) as 'Fireman',
group_concat(if(g.job_title = 'Driver',g.Name, NULL)) as 'Driver',
group_concat(if(g.job_title = 'Analyst', g.Name, NULL)) as 'Analyst',
group_concat(if(g.job_title = 'Research Manager', g.Name, NULL)) as 'Research Manager'
from grouping g
group by g.rnk
order by g.rnk
Qin Pu
  • 1
-1

If you want to easily creat pivot tables (also known as cross tabulation) from any MySQL database where raws of any table are converted to dynamic columns, I recommend Smart Pivot table. This tool uses a very an easy to use wizard-style interface to generate pivot tables which you later can export to MS Excel.

Smart Pivot table is a dynamic PHP reporting tool which means it automatically updates your pivot tables whenever your database is updated.

Generated Pivot table from MySQL DB