I have an Entity Framework 4 design that allows referenced tables to be deleted (no cascade delete) without modifying the entities pointing to them. So for example entity A has a foreign key reference to entity B in the ID field. B can be deleted (and there are no FK constraints in the database to stop that), so if I look at A.B.ID it is always a valid field (since all this does is return the ID field in A) even if there is no record B with that ID due to a previous deletion. This is by design, I don't want cascading deletes, I need the A records to stick around for a while for auditing purposes.
The problem is that filtering out the non-existing deleted records is not as easy as it sounds. So for example if I do this:
from c in A
select A.B.somefield;
This results in a OUTER JOIN in the generated SQL so it's picking up all the A records even if they refer to missing B records. So, the hack I've been using to solve this (since I can't figure out a better way!) is do add a where clause to check a string field in the referenced B records. If that field in the B entity is null, then I assume B doesn't exist.
from c in A
where c.B.somestringfield != null
select A.B.somefield;
seems to work IF B.somestringfield is a string. If it is an integer, this doesn't work!
This is all such a hack to me. I've thought of a few solutions but they are just not practical:
- Query all tables that reference B when a B is deleted and null out their foreign keys. This is so ugly, I don't want to have to remember to do this if I add another entity that references B in the future. Not to mention a huge performace delay resolving all the references whenever I delete something.
- Add a string field to every table that I can count on being there that I can check to see if the entity exists. Blech, I don't want to add a database field just for this.
- Implement a soft delete and keep all the referencial integrity intact - essentially set up cascading deletes, but this is going to result is huge database bloat since I can't clean up a massive amount of records due to the references. No go.
I thought I had this problem licked with the "check if a field in the referenced entity is null" trick but it breaks under conditions that I don't completely understand (what if I don't have any strings in the referenced table? What kinds of fields will work? Integers won't.)
As an example if I have an integer field "count" in entity B and I check to see if it's null like:
from c in A
where c.B.count != null
select c.B.count;
I get a bunch of records with null for count mixed in with the results, and in fact the query bombs out with an "InvalidOperationException: The cast to value type 'Int32' failed because the materialized value is null. Either the result type's generic parameter or the query must use a nullable type."
So I need to do
from c in A
where c.B.count != null
select new { count = (int?)c.B.count };
to even see the null records. So this is pretty baffling to me how that query can result in null records in the results at all.
I just discovered something, if I do an explicit join like this, the SQL is INNER JOIN and everything works great:
from c in A
join j in B on A.B.ID equals j.ID
select c;
But this sucks. I'll have to modify a ton of queries to add explicit join clauses instead of enjoying the convenience of the relationship fields I get with the EF. Kinda defeats the purpose and adds a buch more code to maintain.