3

I'm trying to delete all the organizations that no longer have any users.

Using the below code, I can find all the records I wish to delete:

Organization.includes(:users)
  .where(users: { id: nil })
  .references(:users)

When I add delete_all, I get the same error I would get if I didn't include references:

PG::UndefinedTable: ERROR:  missing FROM-clause entry for table "users"

I could probably write the solution in pure SQL, but I don't understand why Rails isn't keeping the reference to users when I add the delete_all statement.

Here are some more details:

Organization:
  has_many :users

User:
  belongs_to :organization
Tom Prats
  • 7,364
  • 9
  • 47
  • 77
  • should not it be `User.includes(:organizations)...`? – IvanSelivanov Oct 16 '15 at 21:53
  • I updated the question, I switched the relation around a bit too – Tom Prats Oct 16 '15 at 22:01
  • Could you post the code that defines the association between Organizations and Users please? – Thomas Hennes Oct 16 '15 at 22:24
  • I updated it with some more details. Let me know if you'd want anything else – Tom Prats Oct 16 '15 at 22:26
  • you'll find an answer to your question if you execute your query in rails console and see what sql it produces. it tries to `DELETE FROM organizations WHERE ...`. It does not solve Your problem, though :( Also, I don't understand why you use `references()`. All of my similar queries work perfectly without it. I. e. `Product.includes(:category_product_relationships,parent_relationships).where(category_product_relationships: { id: nil }, product_relationships: { parent_id: nil })` – IvanSelivanov Oct 16 '15 at 22:44
  • `references` is only useful when the `where` clause contains an SQL string if you want your `includes` to generate a join instead of 2 separate requests. But that's really not the problem here. – Thomas Hennes Oct 16 '15 at 22:47
  • google shows a bunch of threads like this one: https://github.com/rails/rails/issues/919 so it seems to be a 5 years old bug in rails :) – IvanSelivanov Oct 16 '15 at 23:08
  • 2
    @IvanSelivanov that specific one's been fixed in Rails 4. It's of no use though, since `joins` does an `INNER JOIN` that doesn't catch organizations without users. It's a kind of a bug that's really "absence of a feature", and features have to be designed first... – D-side Oct 16 '15 at 23:37

2 Answers2

2

I've found the includes useful only for eager loading (and it can rarely handle my cases), and when coupled with references it generates something completely insane (aliasing every single field with something like tN_rM) even though it actually does a LEFT OUTER JOIN... Which could help if it didn't vanish once delete_all appears!

I've found that it's much clearer and simpler just to use exists. It's Arel (and there's no point in avoiding it, its under the hood of ActiveRecord anyway), but it's such a tiny portion that it's barely noticeable:

Organization.where(
  User.where('users.organization_id = organizations.id').exists.not
)

Or, if this string of SQL doesn't look nice to you, use a bit more Arel, so it gets noticeable:

Organization.where(
  User.where(organization_id: Organization.arel_table[:id]).exists.not
) # I tend to extract these   ^^^^^^^^^^^^^^^^^^^^^^^ into local variables

That handles chaining .delete_all on top just fine, since it's not (syntactically) a join, even though it's effectively equivalent to one.

The magic behind this

SQL has an EXISTS operator that is similar in functionality to a join, except for inability of selecting fields from a joined table. It forms a valid boolean expression which can be negated and thrown into WHERE-conditions.

In the "SQL-free" form I'm using an expression "column of a table", which turns out to be usable in Rails' hash-conditions. It's an accidental discovery, one of the few uses of Arel that does not make code too bulky.

D-side
  • 9,150
  • 3
  • 28
  • 44
  • This looks pretty cool but I haven't seen the nested syntax much. How does the `Organization.where` know what to get? To me it looks like the `User` query would return users, but I guess it being a subquery changes its functionality? – Tom Prats Oct 17 '15 at 01:24
  • Trying it out but I'm getting 0 records back. I think it might be because the inside query is returning no users back (because they don't exist), and then the outer query is getting the organizations of 0 users – Tom Prats Oct 17 '15 at 01:39
  • @TMP In this case, the inner query alone is useless, because referencing another table in `where` without a join implicitly does `INNER JOIN`. However, using it inside "`EXISTS` makes it behave differently. This syntax is quite hard to come by, I've found it by accident myself (used `exists` instead of `exists?` and wow!). Anyway, should work, could you paste the resulting SQL? I was testing it on different models. – D-side Oct 17 '15 at 06:39
  • @TMP And, uhm, it should only work as an implicit inner join with an appropriate `FROM`-clause which I didn't fill in. So the query should fire an SQL syntax error, how did you make it return 0 users? O_o – D-side Oct 17 '15 at 06:49
  • I tried again today and it worked! I think before I had already deleted all the records, so of course it was returning 0 records – Tom Prats Oct 19 '15 at 14:54
  • @TMP I was under impression that using an inner query by itself would result in an "undefined column" error, regardless of users' existence. All right then! – D-side Oct 19 '15 at 17:06
0

I'm not sure how you plan to implement this in the MVC framework, but it seems clean to do the organization purge via model action. Whenever a user is deleted, check to see of the organization has any remaining members.

in the User.rb

class User < ActiveRecord::Base
  before_destroy :close_user
  ...

 def user_organization
   Organization.where(user_id: id)
 end

  private
    def close_user
        unless user_organization.users.any? 
          user_organization.destroy
        end
    end

end

Added To apply callback delete solution to users being member of many organizations

If the user has multiple organizations

class User < ActiveRecord::Base
      before_destroy :close_user
      ...

     def user_organizations
       Organization.where(user_id: id)
     end

      private
        def close_user
            user_organization.find_each do |organization| 
            unless organization.users.any? 
              organization.destroy
            end
        end

    end

Caveat: this is not tested, didn't fail syntax. I don't have the data to test it fully but I think it will work. But it means running this action after every user delete, which is a system architecture decision. If it's an option, it might be worth a try.

Elvn
  • 3,021
  • 1
  • 14
  • 27
  • I believe in TMP's case, the relation is established by the foreign key `organization_id` in the `users` table, not the other way around. – Thomas Hennes Oct 16 '15 at 23:08
  • @jaxx I take your [point, but maybe you'd agree that checking the organization on user delete makes sense. I'd probably update the relationship in order to implement a clean solution for the zero-user organization delete. – Elvn Oct 16 '15 at 23:28
  • oh yes, definitely agree on the callback, it's a smart implementation and would actually save you the trouble of having to look for user-less organizations at a later stage ;) – Thomas Hennes Oct 16 '15 at 23:32
  • Smart move, yes. Provided that no empty organizations exist at this point, this should maintain this state. Except... I'm not sure, but I'm curious to see what happens when in two parallel transactions (a and b) in (a) last user is deleted -> organization is wiped and (b) a user is added to that organization. It's such a corner-case though... – D-side Oct 17 '15 at 00:04
  • The problem with the callback is that you're usually calling `destroy` or `destroy_all` to invoke it. In my case there are over 10 related models each with their own destroy callbacks. I simplified my question a little bit but what's really happening is that I'm deleting an organization. In a `before_destroy` callback it deletes all the related models: UsersOrganizations, Resources, Options, Data, and more things like that. – Tom Prats Oct 17 '15 at 01:13
  • When it gets to UsersOrganizations, that record is linking a user to an organization. A user can belong to multiple organizations through this (join) table. After the UsersOrganizations records are deleted, I then run a query to delete the users if they have no more UsersOrganizations. Normally I would delete the user through a callback on UsersOrganization, but since I'm potentially deleting millions of records, I use `delete_all` and then manually delete the related records. – Tom Prats Oct 17 '15 at 01:13
  • @TMP, I can see your point that your need to manually delete records will not allow this type of callback solution. But I went ahead and added a section to document a callback that will delete all the organizations affiliated with a given user. Maybe this will help someone else who may come along with a similar problem. – Elvn Oct 17 '15 at 14:36