I have a relatively large (as in >10^6 entries) table called "things" which represent locateable objects, e.g. countries, areas, cities, streets, etc. They are used as a tree of objects with a fixed depth, so the table structure looks like this:
id
name
type
continent_id
country_id
city_id
area_id
street_id
etc.
The association inside "things" is 1:n, i.e. a street or area always belongs to a defined city and country (not two or none); the column city_id for example contains the id of the "city" thing for all the objects which are inside that city. The "type" column contains the type of thing (Street, City, etc) as a string.
This table is referenced in another table "actions" as "thing_id". I am trying to generate a table of action location statistics showing the number of active and inactive actions a given location has. A simple JOIN like
SELECT count(nullif(actions.active, 1)) AS icount, count(nullif(actions.active, 0)) AS acount, things.name AS name, things.id AS thing_id, things.city_id AS city_id FROM "actions" LEFT JOIN things ON actions.thing_id = things.id WHERE UPPER(substring(things.name, 1, 1)) = UPPER('A') AND actions.datetime_at BETWEEN '2012-09-26 19:52:14' AND '2012-10-26 22:00:00' GROUP BY things.name, things.id ORDER BY things.name
will give me a list of "things" (starting with 'A') which have actions associated with them and their active and inactive count like this:
icount | acount | name | thing_id | city_id
------------------------------------------------------------------
0 5 Brooklyn, New York City | 25 | 23
1 0 Manhattan, New York City | 24 | 23
3 2 New York City | 23 | 23
Now I would like to
- only consider "city" things (that's easy: filter by type in "things"), and
- in the active/inactive counts, use the sum of all actions happening in this city - regardless of whether the action is associated with the city itself or something inside the city (= having the same city_id). With the same dataset as above, the new query should result in
icount | acount | name | thing_id | city_id ------------------------------------------------------------------ 4 7 New York City | 23 | 23
I do not need the thing_id in this table (since it would not be unique anyway), but since I do need the city's name (for display), it is probably just as easy to also output the ID, then I don't have to change as much in my code.
How would I have to modify the above query to achieve this? I'd like to avoid additional trips to the database, and advanced SQL features such as procedures, triggers, views and temporary tables, if possible.
I'm using Postgres 8.3 with Ruby 1.9.3 on Rails 3.0.14 (on Mac OS X 10.7.4).
Thank you! :)