49

(Note: My question raises concerns very similar to those of the person who asked this question three months ago, but it was never answered.)

I recently started working with MVC3 + Entity Framework and I keep reading that the best practice is to use the repository pattern to centralize access to the DAL. This is also accompanied by explanations that you want to keep the DAL separate from the domain and especially the view layer. But in the examples I've seen, the repository is (or appears to be) simply returning DAL entities, i.e. in my case the repository would return EF entities.

So my question is, what good is the repository if it only returns DAL entities? Doesn't this add a layer of complexity that doesn't eliminate the problem of passing DAL entities around between layers? If the repository pattern creates a "single point of entry into the DAL", how is that different from the context object? If the repository provides a mechanism to retrieve and persist DAL objects, how is that different from the context object?

Also, I read in at least one place that the Unit of Work pattern centralizes repository access in order to manage the data context object(s), but I don't grok why this is important either.

I'm 98.8% sure I'm missing something here, but from my readings I didn't see it. Of course I may just not be reading the right sources... :\

Community
Dave

7 Answers

64

I think the term "repository" is commonly thought of in the way the "repository pattern" is described by the book Patterns of Enterprise Application Architecture by Martin Fowler.

A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes.

On the surface, Entity Framework accomplishes all of this, and can be used as a simple form of a repository. However, there can be more to a repository than simply a data layer abstraction.

According to the book Domain Driven Design by Eric Evans, a repository has these advantages:

  • They present clients with a simple model for obtaining persistence objects and managing their life cycle
  • They decouple application and domain design from persistence technology, multiple database strategies, or even multiple data sources
  • They communicate design decisions about object access
  • They allow easy substitution of a dummy implementation, for unit testing (typically using an in-memory collection).

The first point roughly equates to the paragraph above, and it's easy to see that Entity Framework itself easily accomplishes it.

Some would argue that EF accomplishes the second point as well. But commonly EF is used simply to turn each database table into an EF entity and pass it through to the UI. It may be abstracting the mechanism of data access, but it's hardly abstracting away the relational data structure behind the scenes.

In simpler applications that are mostly data-oriented, this might not seem to be an important point. But as the application's domain rules / business logic become more complex, you may want to be more object-oriented. It's not uncommon that the relational structure of the data contains idiosyncrasies that aren't important to the business domain, but are side-effects of the data storage. In such cases, it's not enough to abstract the persistence mechanism; you also need to abstract the nature of the data structure itself. EF alone generally won't help you do that, but a repository layer will.

As for the third advantage, EF will do nothing (from a DDD perspective) to help. Typically DDD uses the repository not just to abstract the mechanism of data persistence, but also to provide constraints around how certain data can be accessed:

We also need no query access for persistent objects that are more convenient to find by traversal. For example, the address of a person could be requested from the Person object. And most important, any object internal to an AGGREGATE is prohibited from access except by traversal from the root.

In other words, you would not have an 'AddressRepository' just because you have an Address table in your database. If your design chooses to manage how the Address objects are accessed in this way, the PersonRepository is where you would define and enforce the design choice.
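As a rough sketch of that design choice (all type and member names here are made up for illustration, and no EF types are involved), the aggregate root and its single repository might look something like this:

```csharp
using System;
using System.Collections.Generic;

// Address is internal to the Person aggregate, so it gets no repository of its own.
public class Address
{
    public string Street { get; set; }
    public string PostalCode { get; set; }
}

// Person is the aggregate root; addresses are only reachable by traversal from it.
public class Person
{
    private readonly List<Address> _addresses = new List<Address>();

    public int Id { get; set; }
    public string Name { get; set; }

    // Exposed as a read-only sequence so callers must go through AddAddress.
    public IEnumerable<Address> Addresses
    {
        get { return _addresses; }
    }

    public void AddAddress(Address address)
    {
        if (address == null) throw new ArgumentNullException("address");
        _addresses.Add(address);
    }
}

// Hypothetical repository contract: the only query entry point for this aggregate.
public interface IPersonRepository
{
    Person FindById(int id);
    void Add(Person person);
}
```

A caller that needs an address would load the Person through the repository and walk `person.Addresses`, rather than querying addresses directly.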

Also, a DDD repository would typically be where certain business concepts relating to sets of domain data are encapsulated. An OrderRepository may have a method called OutstandingOrdersForAccount which returns a specific subset of Orders. Or a Customer repository may contain a PreferredCustomerByPostalCode method.
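To make that concrete, here is a minimal sketch of such a method. The `Order` shape, the `IOrderRepository` interface, and the assumption that "outstanding" means "not yet paid" are all invented for the example; the in-memory list only stands in for whatever the ORM would query.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public int Id { get; set; }
    public int AccountId { get; set; }
    public bool IsPaid { get; set; }
}

// The business concept gets a name on the repository contract.
public interface IOrderRepository
{
    IList<Order> OutstandingOrdersForAccount(int accountId);
}

// In-memory stand-in; a real implementation would run the query through the ORM.
public class InMemoryOrderRepository : IOrderRepository
{
    private readonly List<Order> _orders;

    public InMemoryOrderRepository(IEnumerable<Order> orders)
    {
        _orders = new List<Order>(orders);
    }

    public IList<Order> OutstandingOrdersForAccount(int accountId)
    {
        // Assumption for this sketch: "outstanding" means "not yet paid".
        return _orders
            .Where(o => o.AccountId == accountId && !o.IsPaid)
            .ToList();
    }
}
```

The point is that the definition of "outstanding" lives in one place instead of being repeated as an ad hoc filter at every call site.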

Entity Framework's context classes (ObjectContext or DbContext) don't lend themselves well to such functionality without the added repository abstraction layer. They do work well for what DDD calls Specifications, which can be simple boolean expressions passed to a simple method that evaluates the data against the expression and returns the matches.
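A specification in this simple sense can be sketched as a named predicate expression handed to a generic "matching" method. `Customer`, `CustomerSpecs` and `CustomerStore` are hypothetical names, and `AsQueryable()` over a list is only a stand-in for a queryable set coming from the ORM.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Customer
{
    public string Name { get; set; }
    public string PostalCode { get; set; }
    public bool IsPreferred { get; set; }
}

// Each specification is a named, reusable boolean expression.
public static class CustomerSpecs
{
    public static Expression<Func<Customer, bool>> PreferredInPostalCode(string postalCode)
    {
        return c => c.IsPreferred && c.PostalCode == postalCode;
    }
}

// Stand-in for the data source; it only knows how to apply a specification.
public class CustomerStore
{
    private readonly IQueryable<Customer> _customers;

    public CustomerStore(IEnumerable<Customer> customers)
    {
        _customers = customers.AsQueryable();
    }

    public IList<Customer> Matching(Expression<Func<Customer, bool>> specification)
    {
        return _customers.Where(specification).ToList();
    }
}
```

Because the specification is an expression tree rather than a compiled delegate, a LINQ provider has the chance to translate it into a query instead of filtering in memory.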

As for the fourth advantage, while I'm sure there are certain strategies that might let one substitute for the data context, wrapping it in a repository makes it dead simple.
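Here is roughly what that substitution looks like, as a sketch with invented names (`Product`, `IProductRepository`, `PricingService`). The service under test only sees the interface, so a dictionary-backed fake can replace the EF-backed implementation in a unit test.

```csharp
using System;
using System.Collections.Generic;

public class Product
{
    public int Id { get; set; }
    public decimal Price { get; set; }
}

public interface IProductRepository
{
    Product FindById(int id);
    void Add(Product product);
}

// Dummy implementation for tests: a dictionary instead of a database.
public class InMemoryProductRepository : IProductRepository
{
    private readonly Dictionary<int, Product> _items = new Dictionary<int, Product>();

    public Product FindById(int id)
    {
        Product product;
        return _items.TryGetValue(id, out product) ? product : null;
    }

    public void Add(Product product)
    {
        _items[product.Id] = product;
    }
}

// Business logic depends on the abstraction, not on any ORM type.
public class PricingService
{
    private readonly IProductRepository _products;

    public PricingService(IProductRepository products)
    {
        _products = products;
    }

    public decimal PriceWithTax(int productId, decimal taxRate)
    {
        var product = _products.FindById(productId);
        if (product == null) throw new InvalidOperationException("Unknown product.");
        return product.Price * (1 + taxRate);
    }
}
```

A test would populate `InMemoryProductRepository`, pass it to `PricingService`, and assert on the result without touching a database.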

Regarding 'Unit of Work', here's what the DDD book has to say:

Leave transaction control to the client. Although the REPOSITORY will insert into and delete from the database, it will ordinarily not commit anything. It is tempting to commit after saving, for example, but the client presumably has the context to correctly initiate and commit units of work. Transaction management will be simpler if the REPOSITORY keeps its hands off.
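The division of labour in that quote can be sketched as follows. The `InMemoryUnitOfWork` and `AccountRepository` classes are hypothetical stand-ins; with EF, the commit step would correspond to calling `SaveChanges()` on the context.

```csharp
using System.Collections.Generic;

public class Account
{
    public int Id { get; set; }
    public decimal Balance { get; set; }
}

// Hypothetical unit of work: collects pending changes and writes them on Commit.
public class InMemoryUnitOfWork
{
    private readonly List<Account> _pending = new List<Account>();
    private readonly List<Account> _store = new List<Account>();

    public int StoredCount
    {
        get { return _store.Count; }
    }

    public void RegisterNew(Account account)
    {
        _pending.Add(account);
    }

    public void Commit()
    {
        _store.AddRange(_pending);
        _pending.Clear();
    }
}

// The repository registers changes but never commits them itself.
public class AccountRepository
{
    private readonly InMemoryUnitOfWork _unitOfWork;

    public AccountRepository(InMemoryUnitOfWork unitOfWork)
    {
        _unitOfWork = unitOfWork;
    }

    public void Add(Account account)
    {
        _unitOfWork.RegisterNew(account);
    }
}

public static class OpenAccountsExample
{
    // The client decides where the unit of work begins and ends.
    public static int Run()
    {
        var unitOfWork = new InMemoryUnitOfWork();
        var accounts = new AccountRepository(unitOfWork);

        accounts.Add(new Account { Id = 1, Balance = 50m });
        accounts.Add(new Account { Id = 2, Balance = 75m });

        unitOfWork.Commit(); // one commit covers both additions
        return unitOfWork.StoredCount;
    }
}
```

If the repository committed inside `Add`, the client could no longer group the two additions into a single all-or-nothing operation.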

Eric King
  • This is a fantastic response, thanks. I picked up the book _Pro ASP.NET MVC3 Framework_ over the weekend because it goes DD from the ground up and talks much the same way. I do think this is the right way, I'm just trying to piece together the specific details, e.g. it wasn't until this weekend that it was finally clear the repository returns domain objects, not DAL objects. Prior research w/ EF really muddied the waters here, which led to me making "domain" objects out of EF partial classes, which just smells really bad. Thanks again for the comment, I'll re-read it periodically. – Dave Nov 05 '12 at 16:58
  • Good answer, I've always said the difference between a Repository and a DAO is that a Repository provides access to Aggregate Roots. It's a subtle but very important distinction. And Aggregate Roots are much more than simple DTOs. They are capable of doing real work in the domain. – Kyri Sarantakos Apr 01 '13 at 21:59
  • I think the confusion boils down to Entity Framework is perhaps not a full-on repository *as it is most commonly used*. However, there's nothing you've mentioned here that *can't* be done with a `DbContext`. – Chris Pratt Jul 08 '14 at 15:28
  • @ChrisPratt Sure, you can use a `DbContext` **in** a repository, but a `DbContext`, which by itself simply exposes a bunch of `DbSet`, is not a Repository. Even [the docs](http://msdn.microsoft.com/en-us/library/system.data.entity.dbcontext(v=vs.113).aspx) say "A DbContext instance represents a combination of the Unit Of Work and Repository patterns such that it can be used to query from a database and group together changes that will then be written back to the store as a unit." As my answer tries to illustrate, there's much more to a Repository than that. – Eric King Jul 08 '14 at 18:28
  • I understand that. `DbContext` is the UoW and each `DbSet` is a repository. But, and this is key, `DbContext` is just a class. You can put whatever properties you want in it, `DbSet` or not. Therefore you can subclass `DbSet` and create whatever methods you want on it, or you could roll your own based on `IDbSet` and use that in your context instead. You can directly interact with the database, providing methods that call stored procedures and return object collections that function like any old `DbSet`, etc. There's a ton of potential, that very few actually utilize. – Chris Pratt Jul 08 '14 at 18:43
  • @ChrisPratt Well, we somewhat agree. The question was, to paraphrase, "I have a context that calls itself a repository, but there must be more to it, what am I missing?". The answer is, all this other stuff you layer on top of your ORM to make it an *actual repository*. But when you say "each `DbSet` is a repository", I have to strongly disagree. It's simply an abstraction over a database table, but that does not make it a repository. – Eric King Jul 08 '14 at 18:56
26

Entity Framework's DbContext basically resembles a Repository (and a Unit of Work as well). You don't necessarily have to abstract it away in simple scenarios.

The main advantage of the repository is that your domain can be ignorant and independent of the persistence mechanism. In a layer-based architecture, the dependencies point from the UI layer down through the domain (usually called the business logic layer) to the data access layer. This means the UI depends on the BLL, which itself depends on the DAL.

In a more modern architecture (as advocated by domain-driven design and other object-oriented approaches) the domain should have no outward-pointing dependencies. This means the UI, the persistence mechanism and everything else should depend on the domain, and not the other way around.

A repository will then be represented through its interface inside the domain but have its concrete implementation outside the domain, in the persistence module. This way the domain depends only on the abstract interface, not the concrete implementation.
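A small sketch of that arrangement, using namespaces to stand in for separate modules. `MyApp.Domain`, `MyApp.Persistence`, `Invoice` and `IInvoiceRepository` are hypothetical names, and the dictionary-backed class marks the spot where an EF-backed implementation would normally sit.

```csharp
using System.Collections.Generic;

namespace MyApp.Domain
{
    public class Invoice
    {
        public int Id { get; set; }
        public decimal Total { get; set; }
    }

    // The domain owns the contract and knows nothing about how it is fulfilled.
    public interface IInvoiceRepository
    {
        Invoice FindById(int id);
        void Add(Invoice invoice);
    }
}

namespace MyApp.Persistence
{
    using MyApp.Domain;

    // The persistence module depends on the domain, never the other way around.
    // Stand-in implementation; an ORM-backed class would live here instead.
    public class InMemoryInvoiceRepository : IInvoiceRepository
    {
        private readonly Dictionary<int, Invoice> _invoices = new Dictionary<int, Invoice>();

        public Invoice FindById(int id)
        {
            Invoice invoice;
            return _invoices.TryGetValue(id, out invoice) ? invoice : null;
        }

        public void Add(Invoice invoice)
        {
            _invoices[invoice.Id] = invoice;
        }
    }
}
```

Note that nothing in `MyApp.Domain` references `MyApp.Persistence`; only the reverse reference exists.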

That basically is object-orientation versus procedural programming on an architectural level.

See also Ports and Adapters, a.k.a. Hexagonal Architecture.

Another advantage of the repository is that you can create similar access mechanisms to various data sources. Not only to databases but to cloud-based stores, external APIs, third-party applications, etc.

Dennis Traub
  • At what point would it be beneficial to abstract it out? For example, I recently added a second entity model to my app. It points to the same database, but is only referencing views that I created to help populate some reporting screens. Is it better to move to a repository when you have more than one EF context in play? If so, should the repository object simply hold pointers to the contexts, or implement the full range of "FindById", "Save", etc for each type of object? Seems overkill to me. Thanks. – Dave Nov 01 '12 at 15:53
  • @Dave A separate context that provides views doesn't need a repository (and it doesn't need a domain as well, since there's probably not much business logic involved in displaying data). Just have the UI directly use that DbContext to retrieve the views, maybe filtered by user, group, or other restrictions. I'd like to recommend looking into the **CQRS** architectural style here (just google the term). I extended my answer above, trying to explain the purpose of a repository. – Dennis Traub Nov 01 '12 at 16:02
  • A Repository is much more than a layer of abstraction between the Domain Model and persistent store. It's a mechanism to ensure that Entities are obtained via their Aggregate Roots. Without that it's just another name for DAO, no? – Kyri Sarantakos Apr 01 '13 at 22:02
  • Exactly. The repository (in DDD) translates persistence-related objects (table classes of an ORM, event streams, etc) into aggregates of your domain model (and back). Otherwise it's just a fancy name for a DAO. – Dennis Traub Apr 03 '13 at 11:22
  • @Dennis the ORM's job is to translate domain models into persistence-related objects. Make your domain models and then map them to database objects using Fluent configuration. There is no rule that says your EF entities have to match your DB tables. – Spivonious Jul 08 '15 at 17:15
6

You're right, in those simple cases the repository is just another name for a DAO and it brings only one value: the fact that you can switch from EF to another data access technique. Today you're using MSSQL, tomorrow you'll want cloud storage. Or you'll use a micro-ORM instead of EF, or switch from MSSQL to MySQL.

In all those cases it's good that you use a repository, as the rest of the app won't care about what storage you're using now.

There's also the limited case where you get information from multiple sources (db + file system). There a repo will act as the facade, but it's still just another name for a DAO.

A 'real' repository is valid only when you're dealing with domain/business objects; for data-centric apps which won't change storage, the ORM alone is enough.

MikeSW
2

It would be useful in situations where you have multiple data sources, and want to access them using a consistent coding strategy.

For example, you may have multiple EF data models, and some data accessed using traditional ADO.NET with stored procs, and some data accessed using a 3rd party API, and some accessed from an Access database living on a Windows NT4 server sitting under a blanket of dust in your broom closet.

You may not want your business or front-end layers to care about where the data is coming from, so you build a generic repository pattern to access "data", rather than to access "Entity Framework data".

In this scenario, your actual repository implementations will be different from each other, but the code that calls them wouldn't know the difference.
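A toy version of that setup might look like the sketch below. Everything in it is invented for illustration: `IContactRepository` is the shared contract, one implementation wraps an in-memory list (standing in for an ORM-backed source), and the other parses `name;email` lines (standing in for some legacy source).

```csharp
using System.Collections.Generic;
using System.Linq;

public class Contact
{
    public string Name { get; set; }
    public string Email { get; set; }
}

// The one contract the calling code knows about.
public interface IContactRepository
{
    IList<Contact> GetAll();
}

// Stand-in for an ORM-backed source.
public class ListContactRepository : IContactRepository
{
    private readonly List<Contact> _contacts;

    public ListContactRepository(IEnumerable<Contact> contacts)
    {
        _contacts = new List<Contact>(contacts);
    }

    public IList<Contact> GetAll()
    {
        return _contacts.ToList();
    }
}

// Stand-in for a legacy source that delivers "name;email" lines.
public class DelimitedTextContactRepository : IContactRepository
{
    private readonly List<string> _lines;

    public DelimitedTextContactRepository(IEnumerable<string> lines)
    {
        _lines = new List<string>(lines);
    }

    public IList<Contact> GetAll()
    {
        return _lines
            .Select(line => line.Split(';'))
            .Where(parts => parts.Length == 2) // skip malformed lines
            .Select(parts => new Contact { Name = parts[0].Trim(), Email = parts[1].Trim() })
            .ToList();
    }
}

// Calling code: works against the interface and cannot tell the sources apart.
public static class ContactReport
{
    public static int CountContacts(IEnumerable<IContactRepository> sources)
    {
        return sources.Sum(source => source.GetAll().Count);
    }
}
```

Adding another source later means writing one more implementation of the interface; `ContactReport` stays untouched.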

Joe Enos
  • OK I totally understand this. But is the repository still creating and consuming EF (or ADO, or whatever) objects? Or is it supposed to create and consume "neutral" objects? i.e. Am I expected to create yet another layer of mapping just for the repository? If not, I am still passing around EF (...etc...) objects through my layers, so I'm not sure what I gain. – Dave Nov 01 '12 at 15:56
  • I'd say the implementation is totally up to you. Personally, I like translating my entities into plain objects (DTO/POCO/etc.) before giving them back to the business layer or front end, because my database model isn't necessarily my business model. But if you build your tables and entities to reflect exactly the data you need to pass among your layers, that would save you some trouble. But in my example, since some data comes from EF and some doesn't, it might get a little confusing if some repositories returned "Entities" and some didn't. – Joe Enos Nov 01 '12 at 16:00
  • Excellent point. I'm looking into building a domain model for my app right now. I'm also still using entities directly in views for my prototyping, but I'm trying to wean off that as well. – Dave Nov 01 '12 at 19:01
  • I don't understand what you mean when you say passing around EF objects. Those are just normal class objects with fields. They aren't special to EF. EF just maps the data to them, just like you would manually do if you were using ADO.NET. Then your service layer uses these objects. The same class of data that would be used no matter what backend you had. This is why mocking them and testing works. You aren't using EF when testing yet it works because they are just normal C# objects that had data mapped into them. – user441521 May 02 '14 at 12:29
  • @user441521 In older EF versions, entity classes were special, deriving from `EntityObject`, self-tracking changes, and I believe strongly linked to the context - so you couldn't reference one of these classes from your other app tiers without also referencing EF and your context. Code-first, the POCO T4 templates, and the other massive overhauls they've done to EF may change that, but even then, your entity still represents "database data", which may not be what you want to expose to your other tiers (you may want to leave out some columns for example). – Joe Enos May 02 '14 at 17:35
2

Given your scenario, I would simply opt for a set of interfaces that represent the data structures (your Domain Models) that need to be returned from your data layer. Your implementation can then be a mixture of EF, raw ADO.NET or any other type of data store/provider. The key strategy here is that the implementation is abstracted away from the immediate consumer: your Domain layer. This is useful when you want to unit test your domain objects and, in less common situations, change your data provider / database platform altogether.

You should, if you haven't already, consider using an IoC container, as they make loose coupling of your solution very easy by way of Dependency Injection. There are many available; personally I prefer Ninject.

The domain layer should encapsulate all of your business logic - the rules and requirements of the problem domain, and can be consumed directly by your MVC3 web application. In certain situations it makes sense to introduce a services layer that sits above the domain layer, but this is not always necessary, and can be overkill for straightforward web applications.

Baldy
  • Yes I haven't yet progressed to IoC but that is on my list of things to explore in the near future, along with ValueInjecter. Your comments about the domain model holding the business logic make a lot of sense. Thanks. – Dave Nov 01 '12 at 18:53
  • The most important thing is that you are being diligent and questioning the practicality of these patterns in the context of your specific solution, rather than blindly using them because they are a hot topic. Best of luck with your project. – Baldy Nov 02 '12 at 10:08
1

Another thing to consider is that even when you know you will be working with a single data store, it still might make sense to create a repository abstraction. The reason is that there might be a function your application needs that your ORM du jour does badly (performance) or not at all, or that you just don't know how to make the ORM do.

If you are wrapping your ORM behind a well thought out repository interface, you can easily switch between different technologies as you see fit. It's not uncommon in my repositories to see some methods use EF for their work and others use something like PetaPoco, or (gasp) ADO.NET code. The repository abstraction enables you to use exactly the right tool for the job at hand without leaking these complexities into the client code.

cdaq
  • Thanks for the comment. I think you nailed something I didn't see when I first asked the question. Once I used it for a bit the light bulb went off -- repositories return domain objects, not EF entities. So I wound up creating the domain repository and a set of domain classes, "duplicating work" but only if you think the job is to minimize the number of classes; in DDD that is not the case. It wound up being a lot more flexible and usable. And yeah, I did have some repository "helper" methods as well. Made for an easy place to do filtering, etc. – Dave Apr 01 '13 at 14:22
0

I think there is a big misunderstanding of what many articles call "repository." And that's why there are doubts about what real value those abstractions bring.

In my opinion the repository in its pure form is IEnumerable, while you and many articles are talking about a "data access service."

I've blogged about it here.
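For what it's worth, a repository in that collection-like sense might be sketched like this. `Tag` and `TagRepository` are made-up names, and a plain list stands in for whatever storage would sit behind it.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class Tag
{
    public string Name { get; set; }
}

// Behaves like a collection: enumerate it, add to it, remove from it.
public class TagRepository : IEnumerable<Tag>
{
    private readonly List<Tag> _tags = new List<Tag>();

    public void Add(Tag tag)
    {
        _tags.Add(tag);
    }

    public bool Remove(Tag tag)
    {
        return _tags.Remove(tag);
    }

    public IEnumerator<Tag> GetEnumerator()
    {
        return _tags.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class TagQueries
{
    // Clients query with ordinary LINQ; there are no storage-specific methods.
    public static IList<string> NamesStartingWith(TagRepository tags, string prefix)
    {
        return tags
            .Where(t => t.Name.StartsWith(prefix))
            .Select(t => t.Name)
            .ToList();
    }
}
```

Anything beyond this, such as named query methods or save operations, is what the answer above would file under "data access service".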

elixenide
Michael Logutov
  • I see your point, but I think the issue here is the different contexts used to define the term. Domain Driven Design apparently has a different definition of Repository than Fowler does, or at least the definition has expanded to encompass what you describe as a Data Access Service. Context is important. But your point on IEnumerable is interesting. In my case implementing a repository of domain models was profoundly freeing in terms of gains in flexibility, productivity, and power. Perhaps I could have implemented IEnumerable but that is water under the bridge now. :) – Dave Feb 18 '14 at 19:55
  • It's actually quite hard to implement a repository the way Fowler describes it and make it work against a real database. Only a few ORMs today are capable of coming close to it, but no one has ever created a full implementation covering the many scenarios of working with an in-memory object collection with a database in the backend. I have a plan to blog about it and why DDD is not well suited for 99% of all web applications. – Michael Logutov Feb 19 '14 at 05:10
  • I'm curious, did you ever write that blog post? – MarredCheese Jul 07 '21 at 21:25