EF4: Filtering out referenced entities that do not exist

Question

I have an Entity Framework 4 design that allows referenced tables to be deleted (no cascade delete) without modifying the entities pointing to them. So for example entity A has a foreign key reference to entity B in the ID field. B can be deleted (and there are no FK constraints in the database to stop that), so if I look at A.B.ID it is always a valid field (since all this does is return the ID field in A) even if there is no record B with that ID due to a previous deletion. This is by design, I don't want cascading deletes, I need the A records to stick around for a while for auditing purposes.

The problem is that filtering out the non-existing deleted records is not as easy as it sounds. So for example if I do this:

from c in A
select A.B.somefield;

This results in a OUTER JOIN in the generated SQL so it's picking up all the A records even if they refer to missing B records. So, the hack I've been using to solve this (since I can't figure out a better way!) is do add a where clause to check a string field in the referenced B records. If that field in the B entity is null, then I assume B doesn't exist.

from c in A
where c.B.somestringfield != null
select A.B.somefield;

seems to work IF B.somestringfield is a string. If it is an integer, this doesn't work!

This is all such a hack to me. I've thought of a few solutions but they are just not practical:

Query all tables that reference B when a B is deleted and null out their foreign keys. This is so ugly, I don't want to have to remember to do this if I add another entity that references B in the future. Not to mention a huge performace delay resolving all the references whenever I delete something.
Add a string field to every table that I can count on being there that I can check to see if the entity exists. Blech, I don't want to add a database field just for this.
Implement a soft delete and keep all the referencial integrity intact - essentially set up cascading deletes, but this is going to result is huge database bloat since I can't clean up a massive amount of records due to the references. No go.

I thought I had this problem licked with the "check if a field in the referenced entity is null" trick but it breaks under conditions that I don't completely understand (what if I don't have any strings in the referenced table? What kinds of fields will work? Integers won't.)

As an example if I have an integer field "count" in entity B and I check to see if it's null like:

from c in A
where c.B.count != null
select c.B.count;

I get a bunch of records with null for count mixed in with the results, and in fact the query bombs out with an "InvalidOperationException: The cast to value type 'Int32' failed because the materialized value is null. Either the result type's generic parameter or the query must use a nullable type."

So I need to do

from c in A
where c.B.count != null
select new { count = (int?)c.B.count };

to even see the null records. So this is pretty baffling to me how that query can result in null records in the results at all.

I just discovered something, if I do an explicit join like this, the SQL is INNER JOIN and everything works great:

from c in A
join j in B on A.B.ID equals j.ID
select c;

But this sucks. I'll have to modify a ton of queries to add explicit join clauses instead of enjoying the convenience of the relationship fields I get with the EF. Kinda defeats the purpose and adds a buch more code to maintain.

score 1 · Accepted Answer · edited May 23 '17 at 12:11

When you say that your first code snippet creates an OUTER JOIN then it's the case because B is an optional navigation property of entity A. For a required navigation property EF would create an INNER JOIN (explained in more detail here: https://stackoverflow.com/a/7640489/270591).

So, the only alternative I see to your last code snippet (using explicit join in LINQ) - aside from using direct SQL - is to make your navigation property required.

This is still a very ugly hack in my opinion which might have unexpected behaviour in other situations. If a navigation property is required or optional EF adds a "semantic meaning" to this relationship which is: If there is a foreign key != NULL there must be a related entity and EF expects that you don't have removed the enforcement of the FK constraint in the database.

Thank you this is very useful info. I didn't really understand optional vs. required navigation properties, but it certainly does make sense if that is simply mapped to the FK constraints. In my case I totally agree I just abhor breaking the semantic meaning of required since that constraint it is NOT there by design, and if it causes possible unexpected weird behavior, forget it. It looks to me like the best way to go here is to modify certain queries with an explicit LINQ join to force the generated INNER JOIN and get the correct results in those cases. Thanks! — Eric Sassaman, Dec 14 '11 at 17:36

EF4: Filtering out referenced entities that do not exist

1 Answers1