EntityFramework(Repository Pattern, Data Validation, Dto's)

Question

I've been having some rough time sorting out how to create an Restful API with EntityFramework. The problem is mainly because this API should be used in a long time forward and i want to be it as maintainable and clean, with good performance. Enough of that, let's get into the problem.

Disclamer: Because of companypolicy and can't post too much here, but i will try to address the problem the best way possible. There will also just be snippets of code, and may not be valid. I'm also fairly new to C# and as a JuniorD i have never touched an API before.. And excuse my english, this is my second language.

Every model derives from the BaseModel class

public class BaseModel
{
    [Required]
    public Guid CompanyId { get; set; }

    public DateTime CreatedDateTime { get; set; }

    [StringLength(100)]
    public string CreatedBy { get; set; }

    public DateTime ChangedDateTime { get; set; }

    [StringLength(100)]
    public string ChangedBy { get; set; }

    public bool IsActive { get; set; } = true;

    public bool IsDeleted { get; set; }
}

public class Carrier : BaseModel
{
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    [Key]
    public Guid CarrierId { get; set; }

    public int CarrierNumber { get; set; }

    [StringLength(100)]
    public string CarrierName { get; set; }

    [StringLength(100)]
    public string AddressLine { get; set; }

    public Guid? PostOfficeId { get; set; }
    public PostOffice PostOffice { get; set; }
    public Guid? CountryId { get; set; }
    public Country Country { get; set; }

    public List<CustomerCarrierLink> CustomerCarrierLinks { get; set; }
}

Every repository derives from Repository and has it's own interface.

public class CarrierRepository : Repository<Carrier>, ICarrierRepository
{
    public CarrierRepository(CompanyMasterDataContext context, UnitOfWork unitOfWork) : base(context, unitOfWork) { }

    #region Helpers
    public override ObjectRequestResult<Carrier> Validate(Carrier carrier, List<string> errorMessages)
    {
        var errorMessages = new List<string>();

        if(carrier != null)
        {
            var carrierIdentifier = (carrier.CarrierName ?? carrier.CarrierNumber.ToString()) ?? carrier.CarrierGLN;

            if (string.IsNullOrWhiteSpace(carrier.CarrierName))
            {
                errorMessages.Add($"Carrier({carrierIdentifier}): Carrier name is null/empty");
            }
        }
        else
        {
            errorMessages.Add("Carrier: Cannot validate null value.");
        }

        return CreateObjectResultFromList(errorMessages, carrier); // nonsense
    }

}

UnitOfWork derives from class UnitOfWorkDiscoverySet, this class initializes the repository properties using reflection and also contains a method (OnBeforeChildEntityProcessed) for calling every OnBeforeChildEntityProcessed.

public class UnitOfWork : UnitOfWorkDiscoverySet
{
    public UnitOfWork(CompanyMasterDataContext context) 
        : base(context){}

    public CarrierRepository Carriers { get; internal set; }
    public PostOfficeRepository PostOffices { get; internal set; }
    public CustomerCarrierLinkRepository CustomerCarrierLinks { get; internal set; }
}


public IRepository<Entity> where Entity : BaseModel
{
ObjectRequestResult<Entity> Add(Entity entity);
ObjectRequestResult<Entity> Update(Entity entity);
ObjectRequestResult<Entity> Delete(Entity entity);
ObjectRequestResult<Entity> Validate(Entity entity);
Entity GetById(Guid id);
Guid GetEntityId(Entity entity);
}

public abstract class Repository<Entity> : IRepository<Entity> where Entity : BaseModel
{
    protected CompanyMasterDataContext _context;
    protected UnitOfWork _unitOfWork;

    public Repository(CompanyMasterDataContext context, UnitOfWork unitOfWork)
    {
        _context = context;
        _unitOfWork = unitOfWork;
    }

    public ObjectRequestResult<Entity> Add(Entity entity)
    {
        if (!EntityExist(GetEntityId(entity)))
        {
            try
            {
                var validationResult = Validate(entity);

                if (validationResult.IsSucceeded)
                {
                    _context.Add(entity);
                    _context.UpdateEntitiesByBaseModel(entity);
                    _context.SaveChanges();

                    return new ObjectRequestResult<Entity>()
                    {
                        ResultCode = ResultCode.Succceeded,
                        ResultObject = entity,
                        Message = OBJECT_ADDED
                    };
                }

                return validationResult;
            }
            catch (Exception exception)
            {
                return new ObjectRequestResult<Entity>()
                {
                    ResultCode = ResultCode.Failed,
                    ResultObject = entity,
                    Message = OBJECT_NOT_ADDED,
                    ErrorMessages = new List<string>()
                    {
                        exception?.Message,
                        exception?.InnerException?.Message
                    }
                };
            }
        }

        return Update(entity);
    }

    public virtual ObjectRequestResult Validate(Entity entity)
    {
        if(entity != null)
        {
            if(!CompanyExist(entity.CompanyId))
            {
                return EntitySentNoCompanyIdNotValid(entity); // nonsense
            }
        }

        return EntitySentWasNullBadValidation(entity); // nonsense
    }
}

DbContext class:

public class CompanyMasterDataContext : DbContext {

public DbSet<PostOffice> PostOffices { get; set; }
public DbSet<Carrier> Carriers { get; set; }

public DbSet<Company> Companies { get; set; }
public DbSet<CustomerCarrierLink> CustomerCarrierLinks { get; set; }



public UnitOfWork Unit { get; internal set; }

public CompanyMasterDataContext(DbContextOptions<CompanyMasterDataContext> options)
    : base(options)
{
    Unit = new UnitOfWork(this);
}

public void UpdateEntitiesByBaseModel(BaseModel baseModel)
{
    foreach (var entry in ChangeTracker.Entries())
    {
        switch (entry.State)
        {
            case EntityState.Added:
                entry.CurrentValues["CompanyId"] = baseModel.CompanyId;
                entry.CurrentValues["CreatedDateTime"] = DateTime.Now;
                entry.CurrentValues["CreatedBy"] = baseModel.CreatedBy;
                entry.CurrentValues["IsDeleted"] = false;
                entry.CurrentValues["IsActive"] = true;
                Unit.OnBeforeChildEntityProcessed(entry.Entity, enumEntityProcessState.Add);
                break;

            case EntityState.Deleted:
                entry.State = EntityState.Modified;
                entry.CurrentValues["ChangedDateTime"] = DateTime.Now;
                entry.CurrentValues["ChangedBy"] = baseModel.ChangedBy;
                entry.CurrentValues["IsDeleted"] = true;
                Unit.OnBeforeChildEntityProcessed(entry.Entity, enumEntityProcessState.Delete);
                break;

            case EntityState.Modified:
                if (entry.Entity != null && entry.Entity.GetType() != typeof(Company))
                    entry.CurrentValues["CompanyId"] = baseModel.CompanyId;

                entry.CurrentValues["ChangedDateTime"] = DateTime.Now;
                entry.CurrentValues["ChangedBy"] = baseModel.ChangedBy;

                Unit.OnBeforeChildEntityProcessed(entry.Entity, enumEntityProcessState.Update);
                break;
        }
    }
}

}

DiscoveryClass:

    public abstract class UnitOfWorkDiscoverySet
{
    private Dictionary<Type, object> Repositories { get; set; }
    private CompanyMasterDataContext _context;

    public UnitOfWorkDiscoverySet(CompanyMasterDataContext context)
    {
        _context = context;
        InitializeSets();
    }

    private void InitializeSets()
    {
        var discoverySetType = GetType();
        var discoverySetProperties = discoverySetType.GetProperties();

        Repositories = new Dictionary<Type, object>();

        foreach (var child in discoverySetProperties)
        {
            var childType = child.PropertyType;
            var repositoryType = childType.GetInterfaces()
                .Where( i => i.IsGenericType && i.GetGenericTypeDefinition() == typeof(IRepository<>))
                .FirstOrDefault();

            if (repositoryType != null)
            {
                var repositoryModel = repositoryType.GenericTypeArguments.FirstOrDefault();

                if (repositoryModel != null)
                {
                    if (repositoryModel.IsSubclassOf(typeof(BaseModel)))
                    {
                        var repository = InitializeProperty(child); //var repository = child.GetValue(this);

                        if (repository != null)
                        {
                            Repositories.Add(repositoryModel, repository);
                        }
                    }
                }
            }
        }
    }

    private object InitializeProperty(PropertyInfo property)
    {
        if(property != null)
        {
            var instance = Activator.CreateInstance(property.PropertyType, new object[] {
                _context, this
            });

            if(instance != null)
            {
                property.SetValue(this, instance);
                return instance;
            }
        }

        return null;
    }

    public void OnBeforeChildEntityProcessed(object childObject, enumEntityProcessState processState)
    {
        if(childObject != null)
        {
            var repository = GetRepositoryByObject(childObject);
            var parameters = new object[] { childObject, processState };

            InvokeRepositoryMethod(repository, "OnBeforeEntityProcessed", parameters);
        }
    }

    public void ValidateChildren<Entity>(Entity entity, List<string> errorMessages) where Entity : BaseModel
    {
        var children = BaseModelUpdater.GetChildModels(entity);

        if(children != null)
        {
            foreach(var child in children)
            {
                if(child != null)
                {
                    if (child.GetType() == typeof(IEnumerable<>))
                    {
                        var list = (IEnumerable<object>) child;

                        if(list != null)
                        {
                            foreach (var childInList in list)
                            {
                                ValidateChild(childInList, errorMessages);
                            }
                        }
                    }

                    ValidateChild(child, errorMessages);
                }
            }
        }
    }

    public void ValidateChild(object childObject, List<string> errorMessages)
    {
        if(childObject != null)
        {
            var repository = GetRepositoryByObject(childObject);
            var parameters = new object[] { childObject, errorMessages };

            InvokeRepositoryMethod(repository, "Validate", parameters);
        }
    }

    public void InvokeRepositoryMethod(object repository, string methodName, object[] parameters)
    {
        if (repository != null)
        {
            var methodToInvoke = repository.GetType().GetMethod(methodName);
            var methods = repository.GetType().GetMethods().Where(x => x.Name == methodName);

            if (methodToInvoke != null)
            {
                methodToInvoke.Invoke(repository, parameters);
            }
        }
    }

    public object GetRepositoryByObject(object objectForRepository)
    {
        return Repositories?[objectForRepository.GetType()];
    }

    public object GetObject<Entity>(Type type, Entity entity) where Entity : BaseModel
    {
        var childObjects = BaseModelUpdater.GetChildModels(entity);

        foreach (var childObject in childObjects)
        {
            if (childObject.GetType().FullName == type.FullName)
            {
                return childObject;
            }
        }

        return null;
    }
}

}

The problem: I want to validate data in every model and the childmodels properties/list, know you might say this can be done using attributes, but the validation can be fairly complex and i prefer to separate this in it's own space.

The way i have attacked this problem is by using reflection from the UnitDiscoverySet class, here i can find every child of the Entity i'm trying to process and search for an appropriate Repository containg the UnitOfWork. This works by all means, just need some more work and cleanup, but for some reason i feel like this is the cheating/wrong way to attack the problem, and i'm also not getting the compile-time errors + reflection comes at a cost.

I could validate children of the entity in the entity repository, but then i would be repeating myself all over the place, and this solution don't seem right either.

I don't want this solution to depend to heavy on entityframework, since it's not given that we will use this forever.

This solution also heavly depends on the UpdateEntitiesByBaseModel method inside The DbContext. So it only changes the fields that should be changed.

Not sure that i addressed this problem as good as thought i would, but i appreciate every contribution that will lead me down the right path. Thank you!

Solution(Edit): I ended up only using the navigation properties for GET operations, and excluded it for insert operations. Made everything more flexible and faster, this way i didn't need to use the EF Tracker, which made an insert operation of 5000 entities from a 13 minute operation, down to 14.3 seconds.

score 3 · Accepted Answer · answered May 08 '19 at 22:30

This question would probably be best asked in CodeReview rather than SO which is geared towards specific code-related questions. You can ask 10 different developers and get 10 different answers. :)

Reflection definitely does have a cost, and it's not something I like to use very much at all.

I don't want this solution to depend to heavy on entityframework, since it's not given that we will use this forever.

This is a fairly common theme I see in applications and frameworks that development teams I work with try to cope with when working with ORMs. To me, abstracting EF from the solution is like trying to abstract away parts of .Net. There is literally no point because you forfeit access to much of the flexibility and capability that Entity Framework offers. It leads to more, complex code to deal with things that EF can do natively leaving room for bugs as you re-invent the wheel, or leaving gaps that later have to be worked around. You either trust it, or you shouldn't use it.

I could validate children of the entity in the entity repository, but then i would be repeating myself all over the place, and this solution don't seem right either.

This is actually the pattern I've come to advocate for with projects. Many people are against the Repository pattern, but it is a great pattern to serve as a boundary for the domain for testing purposes. (No need to set up in-memory databases or attempt to mock a DbContext/DbSets) However, IMO the Generic Repository pattern is an anti-pattern. It separates entities concerns from one another, however in many cases we are dealing with entity "graphs" not individual entity types. Rather than defining repositories per entity, I opt for something that is effectively a repository per controller. (With repositories for truly common entities such as lookups for example.) There are two reasons behind this:

Fewer dependency references to pass around / mock
Better serves SRP
Avoid pigeon-holing DB operations

The biggest problem I have with generic or per-entity repositories is that while they appear to be conforming to SRP (responsible for operations for a single Entity) I feel they violate it because SRP is about having only one reason to change. If an I have an Order entity and an Order repository I may have several areas of the application that load and interact with orders. Methods to interact with Order entities now are called in several different places, and that forms many potential reasons for methods to be adjusted. You end up either with complex, conditional code, or several very similar methods to serve specific scenarios. (Orders for listing orders, orders by customer, orders by store, etc.) When it comes to validating entities, this is commonly done in context to the entire graph so it makes sense to centralize this in code related to the graph rather than individual entities. This goes for generic base operations like Add/Update/Delete. 80% of the time this works and saves effort, but that remaining 20% either has to get shoe-horned into the pattern, leading to inefficient and/or error-prone code, or work-arounds. K.I.S.S. should always trump D.N.R.Y when it comes to software design. Consolidation into base classes and the like is an optimization that should be done as the code evolves when "identical" functionality is identified. When it's done up-front as an architecture decision, I consider this premature optimization which creates obstacles for development, performance issues, and bugs when "similar" but not "identical" behaviour is grouped together leading to conditional code creeping in for edge cases.

So instead of an OrderRepository to serve orders, if I have something like a ManageOrderController, I will have a ManageOrderRepository to serve it.

For instance I like to use DDD styled methods for managing entities where my repositories play a part in the construction since they are privy to the data domain and can validate/retrieve related entities. So a typical repository would have:

IQueryable<TEntity> GetTEntities()
IQueryable<TEntity> GetTEntityById(id)
IQueryable<TRelatedEntity> GetTRelatedEntities()
TEntity CreateTEntity({all required properties/references})
void DeleteTEntity(entity)
TChildEntity CreateTChildEntity(TEntity, {all required properties/references})

Retrieval methods, including a "By ID" as it's a common scenario, return IQueryable so that callers can control how the data is consumed. This negates the need to try and abstract the Linq capabilities that EF can leverage so callers can apply filters, perform paging, sorting, then consume the data how they need. (Select, Any, etc.) The repository enforces core rules such as IsActive, and tenancy/authorization checks. This serves as a boundary for tests since mocks just have to return List<TEntity>.AsQueryable() or wrapped with a async-friendly collection type. (Unit-testing .ToListAsync() using an in-memory) The repository also serves as a go-to place for retrieving any related entities via whatever criteria is applicable. This can be viewed as potential duplication but changes to this repository will only be needed when the controller/view/area of the application needs to change. Common stuff like lookups would be pulled via their own repository. This cuts down on the need for lots of individual repository dependencies. Each area takes care of itself so changes/optimizations here don't need to consider or impact other areas of the application.

"Create" methods manage the rules around creating and associating entities to the Context to ensure that entities are always created in a minimally complete and valid state. This is where validation comes into play. Any value that is not null-able gets passed in, along with FKs (keys or references) needed to ensure that if SaveChanges() was the next call after Create, the entity would be valid.

"Delete" methods similarly come in here to manage validating data state/authorization, and apply consistent behaviour. (hard vs. soft delete, auditing, etc.)

I don't use "Update" methods. Updates are handled by DDD methods on the entity itself. Controllers define the unit of work, use the repository to retrieve the entity(ies), call the entity methods, then commit the unit of work. Validation can be done at the entity level, or via a Validator class.

In any case, that is just a summary of one approach out of 10+ you may get out there, and hopefully highlights some things to consider with whatever approach you take. My emphasis when it comes to working with EF would be:

Keep it simple. (K.I.S.S. > D.N.R.Y)
Leverage what EF has to offer rather than attempting to hide it.

Complex, clever code ultimately leads to more code, and more code leads to bugs, performance issues, and makes it difficult to adjust for requirements you haven't thought of up-front. (Leading to more complexity, more conditional paths, and more headaches) Frameworks like EF have been tested, optimized, and vetted so leverage them.

Thank you so much, Steve! Your point of view really helps turning my head in another direction. Much appreciated. — Mojo, May 09 '19 at 05:22
Does this mean as an example in the CarrierRepository you would create an instance of Ex: "PostOfficeValidator" and collect the results from here, and add to the main result? — Mojo, May 09 '19 at 06:04
If by editing/adding a Carrier you are editing or creating a PostOffice, or want to validate a post office against proposed properties of the carrier, then yeah, a post office validator could be referenced unless the PostOffice entity could validate itself against details provided from the carrier & related data. I try to keep validation as part of the entity where possible so that actions and validation of a top-level entity can include validation calls to its children as needed. The limit there is that validation cannot go to the DB for more detail. — Steve Py, May 09 '19 at 06:15
Ok, then we are on the same page. Because the entities used in this project are pretty complex, and almost every model need DB validation, i think i will have to stick with this solution. Thanks again. — Mojo, May 09 '19 at 06:24
One more question thats a little off topic, whats your thoughts on the navigation properties? And how expensive is this, when testing the add/update function on on some entities with a list containing 5000 entities there is a timespan on about 10-14 minutes.. By only linking a FK,will this improve the performance? — Mojo, May 09 '19 at 13:49
For larger batch operations I usually employ a bounded context with models that reflect the bare minimum viable record. If I'm validating the FKs by checking that they exist, or are creating child/related entities anyways then I keep references, otherwise I just use FK values and handle exceptions. For normal use entities I always use references /w shadow properties for FKs. For complex batch operations I usually stash the raw input into a table using EF (such as uploads) and have a background process chew through it (SSIS or code) writing to a status table to report progress and issues. — Steve Py, May 09 '19 at 21:20
It's worth noting that for test scenarios where you are actually reading/writing data, these would constitute integration tests rather than unit tests so they are expected to take longer to run. For integration tests I commonly look to take a copy of known data state as a backup then run a suite of tests against that data state, restoring between runs. For unit tests which you would want to run regularly during development, the Repository is the boundary. Rather than running against a DB, the repository is mocked and returns `IQueryable` with POCO entities I supply for the test. — Steve Py, May 09 '19 at 21:23

EntityFramework(Repository Pattern, Data Validation, Dto's)

1 Answers1