0

pic 1

hi, How can i group data based on appearance, into pre set buckets in MYSQL? lets say my appearances are like in picture 1, and i need the output as seen in picture 2? (the ranges values are pre set) thanks in advance

,pic 2

pic 3

Itay Av
  • 69
  • 3
  • 10

1 Answers1

0

You can solve this via a second table which defines the ranges and then use a join on a GROUP BY query. The steps will be as follow:

  1. Create a new table with the ranges
  2. Develop a query for counting and grouping the original data
  3. Run a JOIN query against the new ranges table to determine in which bucket the count will be placed.

Creating and filling the new table for the ranges will be easy (try to avoid the name "range", as it is a keyword):

CREATE TABLE sections
(
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50),
    start INT,
    end INT
);

Fill it with your sections :

INSERT INTO sections(name, start, end) VALUES ('0-2', 0, 2);
[...]
INSERT INTO sections(name, start, end) VALUES ('11+', 11, 1<<31);

Create a query to group and count your data. This example use a table called 'events' with the columns 'catId' and 'timestamp':

SELECT
    COUNT(*) as eventCount
FROM
    events
GROUP BY
    catId

This might create a result like this (there are only 5 different cat ids used):

+------------+
| eventCount |
+------------+
|          3 |
|          3 |
|          5 |
|          1 |
|          1 |
+------------+

Now you need to JOIN this result to the sections table by using the ON condition. For teaching purposes an intermediate (but not working) query will look like this:

SELECT
    s.id,
    s.name,
    s.start,
    s.end,
    COUNT(*) AS sectionCount
FROM
    sections s
JOIN
    (__your_previous_result_here__) e
ON
    e.eventCount BETWEEN s.start AND s.end
GROUP BY
    s.id

Your total query will look like this:

SELECT
    s.id,
    s.name,
    s.start,
    s.end,
    COUNT(*) AS sectionCount
FROM
    sections s
JOIN
    (SELECT
         COUNT(*) as eventCount
     FROM
         events
     GROUP BY
         catId) e
ON
    e.eventCount BETWEEN s.start AND s.end
GROUP BY
    s.id

Please notice the aliases s and e being used here. Depending on the input data the result will look like this:

+----+------+-------+------------+--------------+
| id | name | start | end        | sectionCount |
+----+------+-------+------------+--------------+
|  1 | 0-1  |     0 |          1 |            2 |
|  2 | 2-4  |     2 |          4 |            2 |
|  3 | 5++  |     5 | 2147483647 |            1 |
+----+------+-------+------------+--------------+
Progman
  • 16,827
  • 6
  • 33
  • 48
  • hi Progman, thanks so much for your help! I have two questions about the solution you wrote here: 1) As you can see in "pic 2" , one of the pre set ranges in my example was the range "7-10", and since that there was no sutiable "event count" for this range in 'events' table, We do not see a relevant line for this range, What should I change in the code so that a relevant row with the value 0 for the "Secion Count" field appears? – Itay Av Jan 13 '18 at 11:43
  • 2) my final output display sould need to include the % of the total section Counts, so if you look in my exapmle, there were only 6 appearance in total, 1 that was in the 0-2 range, 4 in the 3-6 range, 0 in the 7-10 range, and 1 in the 11+ range, the needed display should by like in pic3 that i have added to my post just now. thanks again for your much appreciated help! – Itay Av Jan 13 '18 at 11:43
  • @ItayAv For 1) You can use a `LEFT JOIN` (or `RIGHT JOIN`) to get a link between the two tables (table `sections` and the previous result set) even when there are no corresponding link between them. It will create a sum of `0` for these rows. Check the MySQL manual about how a `LEFT JOIN` works different to a normal `JOIN` query. – Progman Jan 13 '18 at 11:54
  • @ItayAv For 2) Check other questions like https://stackoverflow.com/questions/3061655/getting-percentage-of-count-to-the-number-of-all-items-in-group-by or https://stackoverflow.com/questions/39135196/calculate-percentage-in-sql-by-group or https://stackoverflow.com/questions/14516023/how-to-get-a-percentage-of-total-when-the-query-has-a-group-by or https://stackoverflow.com/questions/8768625/calculation-of-percentage-of-group-count or any other question. However it might be easier to calculate the percentage in your programming language where you read the data. – Progman Jan 13 '18 at 11:55
  • Thank you very much for your assistent! – Itay Av Jan 16 '18 at 07:23