There is a question about IRepository and what it is used for, that has a seemingly good answer.
My problem, though, is this: how would I cleanly deal with entities that are related to each other, and isn't IRepository then just a layer without a real purpose?
Let's say I have these business objects:
public class Region {
    public Guid InternalId { get; set; }
    public string Name { get; set; }
    public ICollection<Location> Locations { get; set; }
    public Location DefaultLocation { get; set; }
}

public class Location {
    public Guid InternalId { get; set; }
    public string Name { get; set; }
    public Guid RegionId { get; set; }
}
There are rules:
- Every Region MUST have at least one location
- Newly created Regions are created with a location
- No SELECT N+1 please
So what would my RegionRepository look like?
public class RegionRepository : IRepository<Region>
{
    // Linq To Sql, injected through constructor
    private Func<DataContext> _l2sfactory;

    public ICollection<Region> GetAll() {
        using (var db = _l2sfactory()) {
            return db.GetTable<DbRegion>()
                     .Select(dbr => MapDbObject(dbr))
                     .ToList();
        }
    }

    private Region MapDbObject(DbRegion dbRegion) {
        if (dbRegion == null) return null;
        return new Region {
            InternalId = dbRegion.ID,
            Name = dbRegion.Name,
            // Locations is EntitySet<DbLocation>
            Locations = dbRegion.Locations.Select(loc => MapLoc(loc)).ToList(),
            // DefaultLocation is EntityRef<DbLocation>
            DefaultLocation = MapLoc(dbRegion.DefaultLocation)
        };
    }

    private Location MapLoc(DbLocation dbLocation) {
        // Where should this come from?
    }
}
So as you can see, a RegionRepository needs to fetch locations as well. In my example, I use Linq To Sql EntitySet/EntityRef, but now RegionRepository needs to deal with mapping Locations to business objects (because I have two sets of objects, business and L2S objects).
Should I refactor this to something like:
public class RegionRepository : IRepository<Region>
{
    private IRepository<Location> _locationRepo;

    // snip

    private Region MapDbObject(DbRegion dbRegion) {
        if (dbRegion == null) return null;
        return new Region {
            InternalId = dbRegion.ID,
            Name = dbRegion.Name,
            // Now, LocationRepo needs to concern itself with Regions...
            Locations = _locationRepo.GetAllForRegion(dbRegion.ID),
            // DefaultLocation is a uniqueidentifier
            DefaultLocation = _locationRepo.Get(dbRegion.DefaultLocationId)
        };
    }
}
Now I have nicely separated my data layer into atomic repositories, each dealing with only one type. I fire up the Profiler and... whoops, SELECT N+1, because each Region calls the location repository. We only have a dozen regions and 40 or so locations, so the natural optimization is to use DataLoadOptions. The problem is that RegionRepository doesn't know whether LocationRepository is using the same DataContext or not. We are injecting factories here after all, so LocationRepository might spin up its own. And even if it doesn't, I'm calling a method that returns business objects, so the DataLoadOptions may not be used anyway.
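To make the round-trip count concrete, here is a minimal in-memory sketch. It does not use Linq To Sql at all: FakeLocationStore, LocationRow and QueryCount are hypothetical names invented for illustration, and each method call simply stands in for one database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for DbLocation.
public class LocationRow
{
    public Guid RegionId { get; }
    public string Name { get; }

    public LocationRow(Guid regionId, string name)
    {
        RegionId = regionId;
        Name = name;
    }
}

// Hypothetical stand-in for the location table; counts simulated round trips.
public class FakeLocationStore
{
    private readonly List<LocationRow> _rows;

    public int QueryCount { get; private set; }

    public FakeLocationStore(IEnumerable<LocationRow> rows)
    {
        _rows = rows.ToList();
    }

    // One simulated round trip per region: this is the N in SELECT N+1.
    public List<LocationRow> GetAllForRegion(Guid regionId)
    {
        QueryCount++;
        return _rows.Where(r => r.RegionId == regionId).ToList();
    }

    // One simulated round trip for all regions, grouped in memory afterwards.
    public ILookup<Guid, LocationRow> GetAllGroupedByRegion()
    {
        QueryCount++;
        return _rows.ToLookup(r => r.RegionId);
    }
}
```

With a dozen regions, mapping through GetAllForRegion costs 12 location queries on top of the one that fetched the regions, while GetAllGroupedByRegion costs a single extra query. The catch from the paragraph above remains: the batched variant only works if whoever maps a Region is allowed to know how locations are loaded.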
Ah, I overlooked something. IRepository is supposed to have a method like this:
public IQueryable<T> Query()
So now I would do
return new Region {
    InternalId = dbRegion.ID,
    Name = dbRegion.Name,
    // Now, LocationRepo needs to concern itself with Regions...
    Locations = _locationRepo.Query()
                             .Where(loc => loc.RegionId == dbRegion.ID)
                             .ToList(),
    // DefaultLocation is a uniqueidentifier
    DefaultLocation = _locationRepo.Get(dbRegion.DefaultLocationId)
};
That looks good. At first. On second inspection, I have separate business and L2S objects, so I still don't see how this avoids SELECT N+1, since Query cannot just return GetTable<DbLocation>.
The problem seems to be having two different sets of objects. But if I decorate the business objects with all the System.Data.Linq.Mapping attributes ([Table], [Column] etc.), that breaks the abstraction and defeats the purpose of IRepository. Maybe I also want to be able to use some other ORM, at which point I would have to decorate my business entities with yet more attributes. And if the business entities live in a separate .Business assembly, its consumers now need to reference every ORM as well for the attributes to be resolved. Yuck!
To me, it seems that IRepository should be IService, and the above class should look like this:
public class RegionService : IRegionService {
    private Func<DataContext> _l2sfactory;

    public void Create(Region newRegion) {
        // Responsibility 1: Business Validation
        // This could of course move into the Region class as
        // a bool IsValid(), but that doesn't change the fact that
        // the service concerns itself with validation
        if (newRegion.Locations == null || newRegion.Locations.Count == 0) {
            throw new Exception("...");
        }
        if (newRegion.DefaultLocation == null) {
            newRegion.DefaultLocation = newRegion.Locations.First();
        }

        // Responsibility 2: Data Insertion, incl. Foreign Keys
        using (var db = _l2sfactory()) {
            var dbRegion = new DbRegion {
                ...
            };
            // Use EntitySet to insert Locations as well
            foreach (var location in newRegion.Locations) {
                var dbLocation = new DbLocation {
                };
                dbRegion.Locations.Add(dbLocation);
            }
            // Insert Region AND all Locations
            // (InsertOnSubmit lives on Table<T>, not on DataContext)
            db.GetTable<DbRegion>().InsertOnSubmit(dbRegion);
            db.SubmitChanges();
        }
    }
}
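The comment in Create mentions moving the check into Region as a bool IsValid(). A minimal sketch of that variant follows, assuming the business classes from the top of the question. EnsureDefaultLocation is a hypothetical helper name, and initializing Locations to an empty list is my own assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Location
{
    public Guid InternalId { get; set; }
    public string Name { get; set; }
    public Guid RegionId { get; set; }
}

public class Region
{
    public Guid InternalId { get; set; }
    public string Name { get; set; }
    // Assumption: start with an empty list instead of null.
    public ICollection<Location> Locations { get; set; } = new List<Location>();
    public Location DefaultLocation { get; set; }

    // Rule from the question: every Region must have at least one location.
    public bool IsValid()
    {
        return Locations != null && Locations.Count > 0;
    }

    // Hypothetical helper: fall back to the first location, as Create does.
    public void EnsureDefaultLocation()
    {
        if (DefaultLocation == null && IsValid())
        {
            DefaultLocation = Locations.First();
        }
    }
}
```

Create would then call newRegion.IsValid() and newRegion.EnsureDefaultLocation() instead of inlining the checks, but the service still decides when validation runs and what happens on failure.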
The RegionService approach also solves a chicken-and-egg problem:
- DbRegion.ID is generated by the database (as newid()) and IsDbGenerated = true is set
- DbRegion.DefaultLocationId is a non-nullable GUID
- DbRegion.DefaultLocationId is a FK into Location.ID
- DbLocation.RegionId is a non-nullable GUID and a FK into Region.ID
Doing this without EntitySet is pretty much impossible, so unless you sacrifice data integrity in the database and move it into the business logic, you cannot keep responsibility for Locations out of the Region provider.
I see how this posting can be seen as not a real question, subjective and argumentative, so please allow me to formulate some objective questions:
- What exactly is the Repository Pattern supposed to abstract away?
- In the real world, how do people optimize their database layer without breaking the abstraction the Repository Pattern is supposed to achieve?
- Specifically, how does the real world deal with SELECT N+1 and data integrity concerns?
I guess my real question is this:
- When already using an ORM (like Linq To Sql), isn't DataContext already my Repository, and thus a Repository on top of DataContext is just abstracting the very same thing again?