Repository Methods vs. Extending IQueryable

Question

I have repositories (e.g. ContactRepository, UserRepository and so forth) which encapsulate data access to the domain model.

When I was looking at searching for data, e.g.

finding a contact whose first name starts with XYZ
a contact whose birthday is after 1960

(etc),

I started implementing repository methods such as FirstNameStartsWith(string prefix) and YoungerThanBirthYear(int year), basically following the many examples out there.

Then I hit a problem - what if I have to combine multiple searches? Each of my repository search methods, such as the ones above, returns only a finite set of actual domain objects. In search of a better way, I started writing extension methods on IQueryable<T>, e.g. this:

public static IQueryable<Contact> FirstNameStartsWith(
               this IQueryable<Contact> contacts, String prefix)
{
    return contacts.Where(
        contact => contact.FirstName.StartsWith(prefix));
}

Now I can do things such as

ContactRepository.GetAll().FirstNameStartsWith("tex").YoungerThanBirthYear(1960);
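For anyone who wants to try the chained call, here is a minimal, self-contained sketch. The question does not show YoungerThanBirthYear, so its body (born after the given year) is an assumption, as are the cut-down Contact class and the in-memory GetAll() that stands in for ContactRepository.GetAll().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, cut-down domain type; a real Contact entity would have more fields.
public class Contact
{
    public string FirstName { get; set; } = "";
    public DateTime Birthday { get; set; }
}

public static class ContactQueries
{
    public static IQueryable<Contact> FirstNameStartsWith(
        this IQueryable<Contact> contacts, string prefix)
    {
        return contacts.Where(contact => contact.FirstName.StartsWith(prefix));
    }

    // ASSUMPTION: "younger than birth year" means born after the given year.
    public static IQueryable<Contact> YoungerThanBirthYear(
        this IQueryable<Contact> contacts, int year)
    {
        return contacts.Where(contact => contact.Birthday.Year > year);
    }
}

public static class Demo
{
    // Stand-in for ContactRepository.GetAll(): an in-memory list, not a database.
    public static IQueryable<Contact> GetAll()
    {
        return new List<Contact>
        {
            new Contact { FirstName = "texas", Birthday = new DateTime(1975, 3, 1) },
            new Contact { FirstName = "texas", Birthday = new DateTime(1950, 3, 1) },
            new Contact { FirstName = "alex",  Birthday = new DateTime(1980, 6, 9) },
        }.AsQueryable();
    }

    public static void Main()
    {
        // Each extension returns IQueryable<Contact>, so the filters compose freely.
        var matches = GetAll()
            .FirstNameStartsWith("tex")
            .YoungerThanBirthYear(1960)
            .ToList();

        Console.WriteLine(matches.Count); // 1
    }
}
```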

However, I found myself writing extension methods (and inventing crazy classes such as ContactsQueryableExtensions) all over, and I lose the "nice grouping" of having everything in the appropriate repository.

Is this really the way to do it, or is there a better way to achieve the same goal?

score 12 · Answer 1 · answered Sep 13 '09 at 01:20

I have been thinking about this a lot lately, after starting at my current job. I am used to repositories, but here they go the full IQueryable route, using just bare-bones repositories as you suggest.

I feel the repo pattern is sound and does a semi-effective job at describing how you want to work with the data in the application domain. However, the issue you are describing definitely occurs. It gets messy, fast, beyond a simple application.

Are there, perhaps, ways to rethink why you are asking for the data in so many ways? If not, I really feel that a hybrid approach is the best way to go. Create repo methods for the stuff you reuse. Stuff that it actually makes sense for. DRY and all that. But those one-offs? Why not take advantage of IQueryable and the sexy things you can do with it? It is silly, as you said, to create a method for that, but it doesn't mean you don't need the data. DRY doesn't really apply there, does it?

It would take discipline to do this well, but I really think it's an appropriate path.

score 8 · Accepted Answer · edited Jun 20 '20 at 09:12

@Alex - I know this is an old question, but what I would do is let the Repository do really simple stuff only. This means: get all records for a table or view.

Then, in the SERVICES layer (you are using an n-tiered solution, right? :) ) I would handle all the 'special' query stuff.

Ok, example time.

Repository Layer

ContactRepository.cs

public IQueryable<Contact> GetContacts()
{
    return (from q in SqlContext.Contacts
            select q).AsQueryable();
}

Nice and simple. SqlContext is the instance of your EF Context .. which has an Entity on it called Contacts .. which is basically your sql Contacts class.

This means, that method basically is doing: SELECT * FROM CONTACTS ... but it's not hitting the database with that query .. it's only a query right now.
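A small sketch of that "it's only a query right now" behaviour. It uses an in-memory list wrapped with AsQueryable() as a stand-in, so no SQL is involved; with EF the same composition would be translated to SQL when the query is executed. The class and method names are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    public static List<string> Run()
    {
        var names = new List<string> { "texas", "alex" };

        // Composing the query only builds an expression; nothing is enumerated yet.
        IQueryable<string> query = names.AsQueryable()
            .Where(n => n.StartsWith("tex"));

        // Added after the query was composed, but before it is executed.
        names.Add("tex");

        // ToList() is the point of execution, so the late addition is included.
        return query.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // texas, tex
    }
}
```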

Ok .. next layer.. KICK ... up we go (Inception anyone?)

Services Layer

ContactService.cs

public ICollection<Contact> FindContacts(string name)
{
    return FindContacts(name, null);
}

public ICollection<Contact> FindContacts(string name, int? year)
{
    // Start from the unfiltered query; nothing has been sent to the database yet.
    IQueryable<Contact> query = _contactRepository.GetContacts();

    // Only add a filter when the caller supplied the corresponding argument.
    if (!string.IsNullOrEmpty(name))
    {
        query = from q in query
                where q.FirstName.StartsWith(name)
                select q;
    }

    if (year.HasValue)
    {
        query = from q in query
                where q.Birthday.Year <= year.Value
                select q;
    }

    // ToList() executes the composed query.
    return query.ToList();
}

Done.

So let's recap. First, we start out with a simple 'Get everything from contacts' query. Now, if we have a name provided, let's add a filter to filter all contacts by name. Next, if we have a year provided, then we filter the birthday by Year. Etc. Finally, we then hit the DB (with this modified query) and see what results we get back.

NOTES:-

I've omitted any Dependency Injection for simplicity. It's more than highly recommended.
This is all pseudo-code. Untested (against a compiler) but you get the idea ....
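Since the dependency injection was left out, here is one way it might look with plain constructor injection. This is only a sketch: IContactRepository, InMemoryContactRepository and the cut-down Contact class are hypothetical names introduced for the illustration, and the in-memory repository stands in for an EF-backed one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, cut-down entity.
public class Contact
{
    public string FirstName { get; set; } = "";
    public DateTime Birthday { get; set; }
}

// Hypothetical abstraction the service depends on.
public interface IContactRepository
{
    IQueryable<Contact> GetContacts();
}

// In-memory stand-in; a real implementation would return the EF set instead.
public class InMemoryContactRepository : IContactRepository
{
    private readonly List<Contact> _contacts;

    public InMemoryContactRepository(IEnumerable<Contact> contacts)
    {
        _contacts = contacts.ToList();
    }

    public IQueryable<Contact> GetContacts()
    {
        return _contacts.AsQueryable();
    }
}

public class ContactService
{
    private readonly IContactRepository _contactRepository;

    // The repository is handed in, so tests can pass an in-memory one.
    public ContactService(IContactRepository contactRepository)
    {
        _contactRepository = contactRepository;
    }

    public ICollection<Contact> FindContacts(string name, int? year = null)
    {
        IQueryable<Contact> query = _contactRepository.GetContacts();

        if (!string.IsNullOrEmpty(name))
            query = query.Where(c => c.FirstName.StartsWith(name));

        if (year.HasValue)
            query = query.Where(c => c.Birthday.Year <= year.Value);

        return query.ToList();
    }
}

public static class Demo
{
    public static ContactService CreateService()
    {
        return new ContactService(new InMemoryContactRepository(new[]
        {
            new Contact { FirstName = "texas", Birthday = new DateTime(1975, 3, 1) },
            new Contact { FirstName = "texas", Birthday = new DateTime(1950, 3, 1) },
            new Contact { FirstName = "alex",  Birthday = new DateTime(1980, 6, 9) },
        }));
    }

    public static void Main()
    {
        var service = CreateService();
        Console.WriteLine(service.FindContacts("tex").Count);       // 2
        Console.WriteLine(service.FindContacts("tex", 1960).Count); // 1
    }
}
```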

Takeaway points

The Services layer handles all the smarts. That is where you decide what data you require.
The Repository is a simple SELECT * FROM TABLE or a simple INSERT/UPDATE into TABLE.

Good luck :)

answered Aug 02 '10 at 01:25

Pure.Krome


I think it is worth mentioning that this really only works efficiently, and please correct me if I am wrong, if you are using a DAL that supports deferred execution such as Linq To Sql as per your example. Otherwise you will retrieve a lot of data from your data store that may not get used. Now if you know the user is going to be using the same data set in multiple different ways this ultimately may be ok, but if they are just running this query once then retrieving entirely unrelated data this design would lead to mentionable overhead. – joshlrogers Aug 02 '10 at 01:36

We've all done this hack. For each additional filter we add an 'optional' parameter. I think the author's original example is far cleaner and easier to use (and reuse). – Jerod Houghtelling Aug 02 '10 at 01:42

@joshlrogers : dude - waaaay incorrect! Do you know what an IQueryable is/does? It doesn't hit the DB. It's a _query_. So I'm not really doing a SELECT * FROM XXX. I *extend* that query by adding extra where clauses .. when required. So the final sql is actually a SELECT * FROM XXX WHERE blah (if the user has provided the *optional* name and/or *optional* year arguments). So unless I misunderstood you .. careful what you say :( – Pure.Krome Aug 02 '10 at 02:26

@Jerod Houghtelling : hack? How is this a hack? With the author's original example, they will have a very complex repository (as he/she suggested).. and then this is sorta-replicated in the services layer. Not very DRY if you ask me. Want a more in-depth example of my answer? Check this good blog post out: http://huyrua.wordpress.com/2010/07/13/entity-framework-4-poco-repository-and-specification-pattern/ – Pure.Krome Aug 02 '10 at 02:28

@Pure.Krome That is exactly what I am saying. Your code example used deferred-execution code. Linq isn't executed until it is actually called, so at that point everything creates one coherent query. I was only bringing this up because if they try to home brew something they should know that it isn't good practice to get an entire table of data and then perform all filtering, sorting, etc. afterwards, except in rare circumstances. I was just trying to make sure people understood what was happening. – joshlrogers Aug 02 '10 at 02:51

@Joshlrogers : i'm so confused :( Are you saying that my code is (1) using deferred execution and (2) returns all rows to code (massive overhead) then client code does filtering/sorting or (3) returns only the exact rows required because filtering/sorting is part of the sql code executed? (My answers are (1) and (3) ... definitely not (2).) – Pure.Krome Aug 02 '10 at 04:35

@Pure.Krome : Sorry, I didn't mean to be confusing. Your code is deferred execution because you are utilizing linq. No, you aren't experiencing (2): due to the nature of deferred execution, the sql isn't generated until the linq query is actually executed with a ToList(). I wasn't trying to say your code was wrong at all, quite the contrary. What I was trying to say, to anyone reading your post who doesn't happen to be using an ORM or DAL that supports deferred execution, is that they would be doing (2) if they followed your instructions. – joshlrogers Aug 02 '10 at 11:26

Gotcha now :) and yep - agreed. But i didn't worry about that 'cause this was an EF question :) – Pure.Krome Aug 02 '10 at 14:38

Rob · Answer 3 · 2012-02-15T17:42:28.700

I realize this is old, but I've been dealing with this same issue lately, and I came to the same conclusion as Chad: with a little discipline, a hybrid of extension methods and repository methods seems to work best.

Some general rules I've been following in my (Entity Framework) application:

Ordering queries

If the method is used only for ordering, I prefer to write extension methods that operate on IQueryable<T> or IOrderedQueryable<T> (to leverage the underlying provider), e.g.

public static IOrderedQueryable<TermRegistration> ThenByStudentName(
    this IOrderedQueryable<TermRegistration> query)
{
    return query
        .ThenBy(reg => reg.Student.FamilyName)
        .ThenBy(reg => reg.Student.GivenName);
}

Now I can use ThenByStudentName() as needed within my repository class.
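To make the usage concrete, here is a small runnable sketch. The cut-down Student and TermRegistration classes, the TermId property and the sample data are assumptions for illustration only, and an in-memory list stands in for the Entity Framework set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, cut-down entities.
public class Student
{
    public string FamilyName { get; set; } = "";
    public string GivenName { get; set; } = "";
}

public class TermRegistration
{
    public int TermId { get; set; }
    public Student Student { get; set; } = new Student();
}

public static class TermRegistrationOrdering
{
    public static IOrderedQueryable<TermRegistration> ThenByStudentName(
        this IOrderedQueryable<TermRegistration> query)
    {
        return query
            .ThenBy(reg => reg.Student.FamilyName)
            .ThenBy(reg => reg.Student.GivenName);
    }
}

public static class Demo
{
    public static List<string> Run()
    {
        var registrations = new List<TermRegistration>
        {
            new TermRegistration { TermId = 2, Student = new Student { FamilyName = "Smith", GivenName = "Ann" } },
            new TermRegistration { TermId = 1, Student = new Student { FamilyName = "Smith", GivenName = "Bob" } },
            new TermRegistration { TermId = 1, Student = new Student { FamilyName = "Jones", GivenName = "Cy" } },
            new TermRegistration { TermId = 1, Student = new Student { FamilyName = "Smith", GivenName = "Al" } },
        }.AsQueryable();

        // A primary ordering is required first, because the extension
        // only accepts an IOrderedQueryable<TermRegistration>.
        return registrations
            .OrderBy(reg => reg.TermId)
            .ThenByStudentName()
            .Select(reg => reg.Student.GivenName)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // Cy, Al, Bob, Ann
    }
}
```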

Queries returning single instances

If the method involves querying by primitive parameters, it usually requires an ObjectContext and can't be easily made static. These methods I leave on my repository, e.g.

public Student GetById(int id)
{
    // Calls context.ObjectSet<T>().SingleOrDefault(predicate) 
    // on my generic EntityRepository<T> class 
    return SingleOrDefault(student => student.Active && student.Id == id);
}

However, if the method instead involves querying an EntityObject using its navigation properties, it can usually be made static quite easily, and implemented as an extension method. e.g.

public static TermRegistration GetLatestRegistration(this Student student)
{
    return student.TermRegistrations.AsQueryable()
        .OrderByTerm()
        .FirstOrDefault();
}

Now I can conveniently write someStudent.GetLatestRegistration() without needing a repository instance in the current scope.
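The OrderByTerm() helper is not shown in the answer. A sketch of how it might look, assuming (my assumption, not the author's code) that it sorts newest term first so that FirstOrDefault() yields the latest registration; the Term.StartDate property and the cut-down entities are likewise hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, cut-down entities.
public class Term
{
    public string Name { get; set; } = "";
    public DateTime StartDate { get; set; }
}

public class TermRegistration
{
    public Term Term { get; set; } = new Term();
}

public class Student
{
    public ICollection<TermRegistration> TermRegistrations { get; } =
        new List<TermRegistration>();
}

public static class RegistrationQueries
{
    // ASSUMPTION: "order by term" means newest term first.
    public static IOrderedQueryable<TermRegistration> OrderByTerm(
        this IQueryable<TermRegistration> query)
    {
        return query.OrderByDescending(reg => reg.Term.StartDate);
    }

    // Needs only the student's navigation property, so no repository is involved.
    public static TermRegistration GetLatestRegistration(this Student student)
    {
        return student.TermRegistrations.AsQueryable()
            .OrderByTerm()
            .FirstOrDefault();
    }
}

public static class Demo
{
    public static Student CreateStudent()
    {
        var student = new Student();
        student.TermRegistrations.Add(new TermRegistration
        {
            Term = new Term { Name = "Fall 2011", StartDate = new DateTime(2011, 9, 1) }
        });
        student.TermRegistrations.Add(new TermRegistration
        {
            Term = new Term { Name = "Spring 2012", StartDate = new DateTime(2012, 1, 15) }
        });
        return student;
    }

    public static void Main()
    {
        Console.WriteLine(CreateStudent().GetLatestRegistration().Term.Name); // Spring 2012
    }
}
```

Note that GetLatestRegistration() returns null for a student with no registrations, because FirstOrDefault() is used.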

Queries returning collections

If the method returns some IEnumerable, ICollection or IList, then I like to make it static if possible, and leave it on the repository even if it uses navigation properties. e.g.

public static IList<TermRegistration> GetByTerm(Term term, bool ordered)
{
    var termReg = term.TermRegistrations;
    return (ordered)
        ? termReg.AsQueryable().OrderByStudentName().ToList()
        : termReg.ToList();
}

This is because my GetAll() methods already live on the repository, and it helps to avoid a cluttered mess of extension methods.

Another reason for not implementing these "collection getters" as extension methods is that they would require more verbose naming to be meaningful, since the return type isn't implied. For example, the last example would become GetTermRegistrationsByTerm(this Term term).

I hope this helps!

Meaningful naming, aka *Ubiquitous Language* is a key goal and benefit for a services layer. **GetRegistrationsByTerm()** is a good example. — one.beat.consumer, Mar 10 '16 at 19:55

score 0 · Answer 4 · answered Mar 10 '16 at 19:44

Six years later, I am certain @Alex has solved his problem, but after reading the accepted answer I wanted to add my two cents.

The general purpose of extending IQueryable collections in a repository is to provide flexibility and empower its consumers to customize data retrieval. What Alex has already done is good work.

The primary role of a service layer is to adhere to the separation of concerns principle and address command logic associated with business function.

In real world applications, query logic often needs no extension beyond the retrieval mechanics provided by the repository itself (ex. value alterations, type conversions).

Consider the two following scenarios:

IQueryable<Vehicle> Vehicles { get; }

// raw data
public static IQueryable<Vehicle> OwnedBy(this IQueryable<Vehicle> query, int ownerId)
{
    return query.Where(v => v.OwnerId == ownerId);
}

// business purpose
public static IQueryable<Vehicle> UsedThisYear(this IQueryable<Vehicle> query)
{
    return query.Where(v => v.LastUsed.Year == DateTime.Now.Year);
}

Both methods are simple query extensions but, however subtly, they have different roles. The first is a simple filter, while the second implies a business need (e.g. maintenance or billing). In a simple application one might implement them both in a repository. In a more idealistic system, UsedThisYear is best suited to the service layer (and may even be implemented as a normal instance method), where it may also better facilitate the CQRS strategy of separating commands and queries.
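As a rough sketch of that split, UsedThisYear could live on a service as a normal instance method while OwnedBy stays a plain query extension. The VehicleService class and the injected clock (a Func<DateTime>) are my own assumptions, added so the "this year" rule can be tested; neither is part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, cut-down entity.
public class Vehicle
{
    public int OwnerId { get; set; }
    public DateTime LastUsed { get; set; }
}

// Raw-data filter: stays a plain query extension.
public static class VehicleQueries
{
    public static IQueryable<Vehicle> OwnedBy(this IQueryable<Vehicle> query, int ownerId)
    {
        return query.Where(v => v.OwnerId == ownerId);
    }
}

// Business rule: lives in the service layer as an instance method.
public class VehicleService
{
    private readonly IQueryable<Vehicle> _vehicles;
    private readonly Func<DateTime> _now; // ASSUMPTION: injected clock, for testability

    public VehicleService(IQueryable<Vehicle> vehicles, Func<DateTime> now)
    {
        _vehicles = vehicles;
        _now = now;
    }

    public IQueryable<Vehicle> UsedThisYear()
    {
        // Read the clock once, outside the expression, and compare against the value.
        int year = _now().Year;
        return _vehicles.Where(v => v.LastUsed.Year == year);
    }
}

public static class Demo
{
    public static VehicleService CreateService()
    {
        var vehicles = new List<Vehicle>
        {
            new Vehicle { OwnerId = 7, LastUsed = new DateTime(2016, 2, 1) },
            new Vehicle { OwnerId = 7, LastUsed = new DateTime(2014, 8, 20) },
            new Vehicle { OwnerId = 9, LastUsed = new DateTime(2016, 1, 5) },
        }.AsQueryable();

        // Fixed clock so the result does not depend on when the demo runs.
        return new VehicleService(vehicles, () => new DateTime(2016, 3, 10));
    }

    public static void Main()
    {
        var service = CreateService();
        // The service result is still IQueryable, so raw filters compose on top of it.
        Console.WriteLine(service.UsedThisYear().OwnedBy(7).Count()); // 1
    }
}
```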

Key considerations are (a) the primary purpose of your repository and (b) how closely you want to adhere to CQRS and DDD philosophies.