68

I've a database table commits with the following columns:

id | author_name | author_email | author_date (timestamp) | total_lines

Sample contents are:

1 | abc | abc@xyz.com | 2013-03-24 15:32:49 | 1234
2 | abc | abc@xyz.com | 2013-03-27 15:32:49 | 534
3 | abc | abc@xyz.com | 2014-05-24 15:32:49 | 2344
4 | abc | abc@xyz.com | 2014-05-28 15:32:49 | 7623

I want to get a result as follows:

id | name | week | commits
1  | abc  | 1    | 2
2  | abc  | 2    | 0

I searched online for similar solutions but couldnt get any helpful ones.

I tried this query:

SELECT      date_part('week', author_date::date) AS weekly,
        COUNT(author_email)           
FROM        commits
GROUP BY    weekly
ORDER BY weekly

But its not the right result.

user6058071
  • 757
  • 2
  • 6
  • 7

2 Answers2

121

If you have multiple years, you should take the year into account as well. One way is:

SELECT date_part('year', author_date::date) as year,
       date_part('week', author_date::date) AS weekly,
       COUNT(author_email)           
FROM commits
GROUP BY year, weekly
ORDER BY year, weekly;

A more natural way to write this uses date_trunc():

SELECT date_trunc('week', author_date::date) AS weekly,
       COUNT(author_email)           
FROM commits
GROUP BY weekly
ORDER BY weekly;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • 4
    @GordonLinoff : How can we include zero commit weeks also in this query? I tested this one and it gives all rows with at least one commit. – Jilson Thomas Mar 23 '16 at 00:30
  • @JilsonThomas . . . If you have a question, it is better to ask as a question rather than in a comment. – Gordon Linoff Mar 23 '16 at 02:26
  • 23
    @GordonLinoff Grouping by `date_part('year', x), date_part('week', x)` is most likely not what the user wants and should be considered incorrect. For example the first day of 2017 is still part of week 52 that started in 2016 and therefore the `(2017, 52)` pair is probably not at all what you want as it sorts to the end of 2017. Using ´date_trunc('week', x)` on the other hand works as one would expect. – Henry Heikkinen May 02 '17 at 12:25
  • @GordonLinoff I am getting this error - django.db.utils.ProgrammingError: function date_trunc(unknown, unknown) is not unique and HINT: Could not choose a best candidate function. You might need to add explicit type casts. I am using Django and PostgreSQL. – Chopra Oct 21 '18 at 05:50
  • @Chopra . . . If you have a question, you should ask it as a question with sample data, desired results, an appropriate explanation and database tag, not in a comment. – Gordon Linoff Oct 21 '18 at 13:06
  • 2
    Note: `date_part` formats the week as week number (probably more desirable), whereas `date_trunc` gives a date. – Tom Söderlund Nov 19 '18 at 14:06
  • @GordonLinoff I would suggest the column name `first_day_of_the_week` instead of `weekly` for `date_trunc('week', author_date::date)` – Paul Verschoor Nov 17 '22 at 06:27
25

If you want the count of all the intermediate weeks as well where there are no commits/records, you can get it by providing a start_date and end_date to generate_series() function

SELECT t1.year_week week, 
       t2.commit_count 
FROM   (SELECT week, 
               To_char(week, 'IYYY-IW') year_week 
        FROM   generate_series('2020-02-01 06:06:51.25+00'::DATE, 
               '2020-04-05 12:12:33.25+00':: 
               DATE, '1 week'::interval) AS week) t1 
       LEFT OUTER JOIN (SELECT To_char(author_date, 'IYYY-IW') year_week, 
                               COUNT(author_email)             commit_count 
                        FROM   commits 
                        GROUP  BY year_week) t2 
                    ON t1.year_week = t2.year_week; 

The output will be:

     week | commit_count  
----------+-------------
2020-05   | 2
2020-06   | NULL  
2020-07   | 1 
saintlyzero
  • 1,632
  • 2
  • 18
  • 26
  • Worth noting you can `COALESCE(commit_count,0)`, which usually presents the analysis more cleanly and makes numeric calculations easier – StonedTensor Mar 02 '23 at 11:58