4

I am trying to join the data from three SQL tables.

The tables are in the following format:

clients

╔════════╗
║ CLIENT ║
╠════════╣
║ A      ║
║ B      ║
║ C      ║
║ D      ║
╚════════╝

work_times

╔════════╦══════════╦════════╦════════════╗
║ Client ║   Work   ║ Amount ║    Date    ║
╠════════╬══════════╬════════╬════════════╣
║ A      ║ Web Work ║     10 ║ 2013-01-12 ║
║ B      ║ Research ║     20 ║ 2013-01-20 ║
║ A      ║ Web Work ║     15 ║ 2013-01-21 ║
║ C      ║ Research ║     10 ║ 2013-01-28 ║
╚════════╩══════════╩════════╩════════════╝

expenses

╔════════╦══════════╦════════╦════════════╗
║ Client ║   Item   ║ Amount ║    Date    ║
╠════════╬══════════╬════════╬════════════╣
║ A      ║ Software ║     10 ║ 2013-01-12 ║
║ B      ║ Software ║     20 ║ 2013-01-20 ║
╚════════╩══════════╩════════╩════════════╝

I would like a query that returns the Count and Sum of the work and expenses for each client, i.e.:

╔════════╦═══════════╦═══════════╦══════════════╦══════════════╗
║ CLIENT ║ COUNTWORK ║ WORKTOTAL ║ COUNTEXPENSE ║ EXPENSETOTAL ║
╠════════╬═══════════╬═══════════╬══════════════╬══════════════╣
║ A      ║         2 ║        25 ║            1 ║           10 ║
║ B      ║         1 ║        20 ║            1 ║           20 ║
║ C      ║         1 ║        10 ║            0 ║            0 ║
╚════════╩═══════════╩═══════════╩══════════════╩══════════════╝

So far I have the following:

SELECT clients.Client,
 COUNT(distinct work_times.id) AS num_work,
 COUNT(expenses.id) AS num_expenses
FROM
 clients
 INNER JOIN work_times ON work_times.Client = clients.Client
   INNER JOIN expenses ON expenses.Client = work_times.Client
GROUP BY
  clients.Client

Which seems to be along the right lines but which skips clients for which there is no expense and seems to multiply the num_expenses by num_work. I would also like to add a WHERE clause to specify to only return the work times and expenses between two dates. What changes do I need to make to the query to get the desired output?

John Woo
  • 258,903
  • 69
  • 498
  • 492
Nick
  • 4,304
  • 15
  • 69
  • 108

3 Answers3

6

You need to separately calculate the values in a subquery. The purpose of WHERE clause on the outer most query is to filter out records that has record atleast on one table. So in this case, Client D will not be shown on the result list.

SELECT  a.*,
        COALESCE(b.totalCount, 0) AS CountWork,
        COALESCE(b.totalAmount, 0) AS WorkTotal,
        COALESCE(c.totalCount, 0) AS CountExpense,
        COALESCE(c.totalAmount, 0) AS ExpenseTotal
FROM    clients A
        LEFT JOIN
        (
            SELECT  Client, 
                    COUNT(*) totalCount,
                    SUM(Amount) totalAmount
            FROM    work_times
            WHERE   DATE BETWEEN '2013-01-01' AND '2013-02-01'
            GROUP   BY Client
        ) b ON a.Client = b.Client
        LEFT JOIN
        (
            SELECT  Client, 
                    COUNT(*) totalCount,
                    SUM(Amount) totalAmount
            FROM    expenses
            WHERE   DATE BETWEEN '2013-01-01' AND '2013-02-01'
            GROUP   BY Client
        ) c ON a.Client = c.Client
WHERE   b.Client IS NOT NULL OR
        c.Client IS NOT NULL

UPDATE

╔════════╦═══════════╦═══════════╦══════════════╦══════════════╗
║ CLIENT ║ COUNTWORK ║ WORKTOTAL ║ COUNTEXPENSE ║ EXPENSETOTAL ║
╠════════╬═══════════╬═══════════╬══════════════╬══════════════╣
║ A      ║         2 ║        25 ║            1 ║           10 ║
║ B      ║         1 ║        20 ║            1 ║           20 ║
║ C      ║         1 ║        10 ║            0 ║            0 ║
╚════════╩═══════════╩═══════════╩══════════════╩══════════════╝
John Woo
  • 258,903
  • 69
  • 498
  • 492
  • I have to make a slight change to your query to group by Month and then by Client and then order by Month. I have had a go on the fiddle, which you can see [here](http://www.sqlfiddle.com/#!2/fd61b/5), but the output is being ordered by Client rather than Month. Can you suggest anything? – Nick Feb 24 '13 at 16:26
  • is the output on the fiddle correct by the only problem is ordering? – John Woo Feb 24 '13 at 16:33
  • The output is almost correct. With the changes I have made to the dates there is one work time for Client B in January 2013 and one expense for Client B in February 2013, and yet they are getting returned in the same row. Other than this the ordering is a problem. Some months there may be an expense for a client but no work time and visa versa. – Nick Feb 24 '13 at 16:40
  • ok, i'll analyze the problem now. One tip I can give you is to open another question for this so many others can see the problem `:D` – John Woo Feb 24 '13 at 16:45
  • I have asked another question [here](http://stackoverflow.com/questions/15054349/using-group-by-and-order-by-on-an-inner-join-sql-query) – Nick Feb 24 '13 at 17:24
0

You can move the group by to a subquery, so you don't repeat every work_time for every expense. Once you have the subqueries, it's easy to add a date filter to both:

SELECT  clients.Client
,       work_times.cnt AS num_work
,       work_times.total AS total_work
,       expenses.cnt AS num_expenses
,       expenses.total AS total_expenses
FROM    clients
LEFT JOIN
        (
        SELECT  Client
        ,       COUNT(DISTINCT id) as cnt
        ,       SUM(Amount) as total
        FROM    work_times
        WHERE   Date between '2013-01-01' and '2013-02-01'
        GROUP BY
                Client
        ) work_times
ON      work_times.Client = clients.Client
LEFT JOIN
        (
        SELECT  Client
        ,       COUNT(DISTINCT id) as cnt
        ,       SUM(Amount) as total
        FROM    expenses
        WHERE   Date between '2013-01-01' and '2013-02-01'
        GROUP BY
                Client
        ) expenses
ON      expenses.Client = clients.Client
Andomar
  • 232,371
  • 49
  • 380
  • 404
0

I don't have a proper instance to test here, but I would probably start so and then chek if I can further improve the query...

select 
  T1.client, 
  ce AS 'Count Work', 
  am AS 'Work Total', 
  ci AS 'Count Expense', 
  am2 AS 'Expense Total' 
from (
  select 
    client, 
    count (work) as ce, 
    sum(amount) as am 
  FROM 
    clients 
      left join work_times 
      on fk_client=client 
  group by 
    fk_client
) T1 
left join (
  select 
    client, 
    count(item) as ci, 
    sum(amount) as am2 
  from 
    clients 
      left join expenses 
      on fk_client=client 
  group by fk_client
) T2 
where T1.client=T2.client;

Maybe this looks quite complicated, but it makes sure that you only have one row for every client. Maybe it's even more readable later...

TomS
  • 467
  • 9
  • 25