0

For simplicity and clarity, I have the following two SQL table structures:

tbl1: col1, col2, col3

tbl2: col1, col2, col3

Both tables have the same number of columns, and the column IDs in both tables are the same. Also, for both tables, all values in col1 are the same. Both tables have approximately 2000 entries, all of which are integers, and both tables typically have the same number of entries.

Now, I have a third table that has the same structure as the first two, that I am trying to populate with the sums of every value in each column from the first two tables as follows:

tbl3: col1, (tbl1.col2+tbl2.col2), (tbl1.col3+tbl2.col3)

Eventually, I will also have a fourth table that will have the average of every value in the first two tables as follows:

tbl3: col1, (tbl1.col2+tbl2.col2)/2, (tbl1.col3+tbl2.col3)/2

I have been researching how to do this using subqueries and union operations such as those described by @Jonysuise here but when I do this I am seeing incorrect values inserted into tbl3, no values at all, or PDO syntax errors.

Also, I am executing this via PDO due to how this data needs to be handled upstream, and as I am less familiar with this structure I have started wondering if there exists some different recommended way of doing this that I am just not getting.

The last version of this that did not throw errors was:

$db->exec("INSERT INTO tbl3 (col1, col2, col3) select col1, sum(col2), sum(col3) as total
from
(
    select col1,col2,col3
    from tbl1
    union all
    select col1,col2,col3
    from tbl2
) t
group by col1");

I am trying to find a more efficient way to do this especially one that can accommodate other mathematical operations such as averages.

EDIT: Just to clarify, when I run the script above, if there are no errors, I typically get a value in tb3 that is greater than expected.

Community
  • 1
  • 1
ChiralCenter
  • 71
  • 1
  • 9
  • Your query is fine for what you want to do. However, if all values in `col1` are the same, why bother grouping by that? (This won't affect performance, however.) – Gordon Linoff Jan 03 '16 at 00:47
  • This is a very fair point. I do not necesarrily need to be grouping that way. My bigger issue was that the query above is resulting in a value larger than expected. Sorry for the confusion. See edits above. – ChiralCenter Jan 03 '16 at 00:59
  • Also I am still not sure about ways that it can be modified to provide an average rather than a sum. I have tried using AVG but this is throwing PDO syntax errors. – ChiralCenter Jan 03 '16 at 01:01
  • I think you should be able to replace the "SUM(col)" with "AVG(col)" to get the averages.. Could you post the errors that you get when you use AVG? Also what database are you using? – vmachan Jan 03 '16 at 02:43

2 Answers2

1

As you laid out, you do not need an aggregate, union, or derived table query. Simply join the two tables in the select part of append query and run your unit level operations.

Below yields the same as your above query and avoids a union subquery and aggregation. Additionally, you can still achieve other aggregates at the unit level:

# SUM
$db->exec("INSERT INTO tbl3 (col1, col2, col3) 
           SELECT tbl1.col1, tbl1.col2 + tbl2.col2, tbl1.col3 + tbl2.col3
           FROM tbl1 
           INNER JOIN tbl2 
           ON tbl1.col1 = tbl2.col1;");

# AVERAGE (ΣXi)/N ; MEDIAN 
$db->exec("INSERT INTO tbl4 (col1, col2, col3) 
           SELECT tbl1.col1, (tbl1.col2 + t2.col2)/2, (tbl1.col3 + t2.col3)/2
           FROM tbl1 
           INNER JOIN tbl2 
           ON tbl1.col1 = tbl2.col1;");

# MAX; MIN
$db->exec("INSERT INTO tbl5 (col1, col2, col3, col4, col5) 
           SELECT tbl1.col1,
                  CASE WHEN tbl1.col2 > t2.col2 THEN tbl1.col2
                       ELSE tbl2.col2 END As MaxCol2, 
                  CASE WHEN tbl1.col3 > t2.col3 THEN tbl1.col3
                       ELSE tbl2.col3 END As MaxCol3, 
                  CASE WHEN tbl1.col2 < t2.col2 THEN tbl1.col2
                       ELSE tbl2.col2 END As MinCol2, 
                  CASE WHEN tbl1.col3 < t2.col3 THEN tbl1.col3
                       ELSE tbl2.col3 END As MinCol3, 
           FROM tbl1 
           INNER JOIN tbl2 
           ON tbl1.col1 = tbl2.col1;");

# POPULATION VARIANCE: σ^2 = Σ(Xi-μ)^2/N 
$db->exec("INSERT INTO tbl6 (col1, col2, col3) 
           SELECT tbl1.col1,
                  POWER(((t1.col2 - ((t1.col2 + t2.col2)/2)) +
                         (t2.col2 - ((t1.col2 + t2.col2)/2))),2) / 2,

                  POWER(((t1.col3 - ((t1.col3 + t2.col3)/2)) +
                         (t2.col3 - ((t1.col3 + t2.col3)/2))),2) / 2
           FROM tbl1 
           INNER JOIN tbl2 
           ON tbl1.col1 = tbl2.col1;");

# POPULATION STANDARD DEVIATION: σ = sqrt[Σ(Xi-μ)^2/N]
$db->exec("INSERT INTO tbl7 (col1, col2, col3) 
           SELECT tbl1.col1,
             POWER(
                   POWER(((t1.col2 - ((t1.col2 + t2.col2)/2)) +
                          (t2.col2 - ((t1.col2 + t2.col2)/2))),2) / 2
                   ,(1/2)),

             POWER(
                   POWER(((t1.col3 - ((t1.col3 + t2.col3)/2)) +
                          (t2.col3 - ((t1.col3 + t2.col3)/2))),2) / 2
                   ,(1/2))
           FROM tbl1 
           INNER JOIN tbl2 
           ON tbl1.col1 = tbl2.col1;");
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • This is exactly what I was trying to clarify! Thanks for the multiple examples. All of these work perfectly! I think my issue was that I for some reason was looking for completely different query structure for every type of operation. Thanks again! – ChiralCenter Jan 03 '16 at 15:11
0

The query you mentioned is working fine. See the SQL fiddle http://sqlfiddle.com/#!9/96f90/6

select queries are as follows ,

select col1, sum(col2) as total2, sum(col3) as total3
from
(
    select col1,col2,col3
    from tbl1
    union all
    select col1,col2,col3
    from tbl2
) t
group by col1;

select col1, avg(col2) as avg2, avg(col3) as avg3
from
(
    select col1,col2,col3
    from tbl1
    union all
    select col1,col2,col3
    from tbl2
) t
group by col1;

Try insert from these queries, it will works

INSERT INTO tbl3 (col1, col2, col3) ...

please check error logs to identify the errors

Mohammed Safeer
  • 20,751
  • 8
  • 75
  • 78