0

I'm having some hard time understanding what I'm doing wrong. The result of this query shows the same results for each row instead of being updated by the right result.

My DATA

I'm trying to update a table of stats over a set of business

business_stats ( id SERIAL,
                 pk integer not null,
                 b_total integer,
                 PRIMARY KEY(pk)
                );

the details of each business are stored here

business_details (id SERIAL,
                  category CHARACTER VARYING,
                  feature_a CHARACTER VARYING,
                  feature_b CHARACTER VARYING,
                  feature_c CHARACTER VARYING
                  );

and here a table that associate the pk with the category

datasets (id SERIAL,
          pk integer not null,
          category CHARACTER VARYING;
          PRIMARY KEY(pk)
          );

WHAT I DID (wrong)

UPDATE business_stats
SET b_total = agg.total
FROM business_stats b,
     (  SELECT  d.pk, count(bd.id) total
        FROM business_details AS bd
            INNER JOIN datasets AS d
            ON bd.category = d.category
        GROUP BY d.pk
     ) agg
WHERE b.pk = agg.pk;

The result of this query is

 | id | pk |  b_total  |
 +----+----+-----------+
 |  1 | 14 |  273611   |
 |  2 | 15 |  273611   |
 |  3 | 16 |  273611   |
 |  4 | 17 |  273611   |

but if I run just the SELECT the results of each pk are completely different

 | pk |  agg.total  |
 +----+-------------+
 | 14 |    273611   |
 | 15 |    407802   |
 | 16 |    179996   |
 | 17 |    815580   |

THE QUESTION

  • why is this happening?
  • why is the WHERE clause not working?

Before writing this question I've used as reference these posts: a, b, c

Community
  • 1
  • 1
entalpia
  • 101
  • 1
  • 11
  • Does `business_details` and `datasets` have the same distinct categories? – Robert Seaman Apr 17 '17 at 17:03
  • 1
    Add some foreign key , at least for readability. BTW: Why do your tables have both a (serial) id and an int field defined as Primary Key? And why doesn't `business_details` have a primary key? – wildplasser Apr 17 '17 at 17:05
  • yes, in my case the categories are A, B, C, and D where in table `datasets' *pk* 14 and *category* is A, *pk* 15 and *category* is B and so on. – entalpia Apr 17 '17 at 17:06
  • 1
    The WHERE clause is working, but there is an implicit self-join with `business_stats`. Try removing `business_stats b,` from FROM clause and changing the WHERE to `business_stats.pk = agg.pk`. But this isn't tested as I could not be bothered to generate data for test tables. – Dan Getz Apr 17 '17 at 17:13
  • @wildpasser I know, when I started working on this project I didn't have any clue on how to deal with tables. Now I'm trying to renew all old tables including a pk where is needed. – entalpia Apr 17 '17 at 17:15

2 Answers2

2

Do the following (I always recommend against joins in Updates)

UPDATE business_stats bs
SET b_total =
(  SELECT   count(c.id) total
        FROM business_details AS bd
        INNER JOIN datasets AS d
        ON bd.category = d.category
       where d.pk=bs.pk
 )
/*optional*/
where exists (SELECT  *
    FROM business_details AS bd
        INNER JOIN datasets AS d
        ON bd.category = d.category
   where d.pk=bs.pk)
Joe Love
  • 5,594
  • 2
  • 20
  • 32
1

The issue is your FROM clause. The repeated reference to business_stats means you aren't restricting the join like you expect to. You're joining agg against the second unrelated mention of business_stats rather than the row you want to update.

Something like this is what you are after (warning not tested):

UPDATE business_stats AS b
SET b_total = agg.total
FROM
     (...) agg
WHERE b.pk = agg.pk;
Richard Huxton
  • 21,516
  • 3
  • 39
  • 51
  • It is true, the repeated presence of the busines_stats in the FROM make the query get crazy. I've restructured in the same way you show and it works fine. thank you – entalpia Apr 17 '17 at 17:28