
If I have 4 thousand documents (maybe quite a few more, haven't checked) I can use a Func<T,bool> in IDocumentSession.Query<T>().Where(...) and I get the expected results. But if I have 800 thousand documents then I have to use an Expression<Func<T,bool>>, otherwise I get no results.

Why is this not consistent?

The issue is that I have the same predicate used for an in-memory filter as well as a database query filter. For the in-memory filter the collection is IEnumerable<T> so it uses a Func<T,bool> whereas for the database query the collection is an IQueryable<T> so it uses an Expression<Func<T,bool>>. So in my production code I have an overloaded "filter" method - one that takes an IEnumerable<T> and another that takes an IQueryable<T>. Content of the former is: return list.Where(getPredicate(x).Compile()) and content of the latter is: return list.Where(getPredicate(x))
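For readers who want to see the shape of the two overloads described above, here is a minimal sketch. The `Doc` type, the `DocFilter` class, and the `prefix` parameter are hypothetical stand-ins for illustration; only the two `Where` calls mirror what the question states.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical document type, used only for illustration.
public class Doc
{
    public string Name { get; set; } = "";
}

public static class DocFilter
{
    // Hypothetical stand-in for the question's getPredicate(x):
    // the predicate is defined once, as an expression tree.
    public static Expression<Func<Doc, bool>> GetPredicate(string prefix)
    {
        return d => d.Name.StartsWith(prefix);
    }

    // In-memory filter: the expression is compiled to a delegate,
    // so this call binds to Enumerable.Where.
    public static IEnumerable<Doc> Filter(IEnumerable<Doc> list, string prefix)
    {
        return list.Where(GetPredicate(prefix).Compile());
    }

    // Query filter: the expression tree is passed as-is,
    // so this call binds to Queryable.Where and the provider can translate it.
    public static IQueryable<Doc> Filter(IQueryable<Doc> list, string prefix)
    {
        return list.Where(GetPredicate(prefix));
    }
}
```

The two bodies look nearly identical, but the `.Compile()` call is what decides whether the filter runs in memory or is handed to the query provider.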

Obviously this just looks like duplicated code and is shouting out: "Please refactor me and reduce the code duplication". But as soon as a developer does that it should break a unit test. However, I can't get a unit test to fail when passing a Func<T,bool>.

Edit: Upon closer inspection, it appears that it has nothing to do with the number of documents. If I connect to the same "production" database in my unit test (calling the actual production code), it returns results when using Func<T,bool> but when I run the application it returns nothing. Very strange!

Shayne
  • I never used RavenDB, but if you use `Func<T,bool>` I am quite sure it would retrieve all data and run through it at client side. – tia May 01 '14 at 11:59
  • I understand that. What I want to do is write a unit test which breaks if a developer changes it from an `Expression<Func<T,bool>>` to a `Func<T,bool>`. But whatever I do, the unit test still passes, because the test data doesn't have as many documents as the production data. – Shayne May 01 '14 at 12:03
  • @Shayne well, if the developers change these kinds of things without knowing the difference... Change the developers! – Raphaël Althaus May 01 '14 at 12:08
  • Or teach them why they are wrong and how to improve. Developers are not objects to be thrown away when they make a mistake. Objects cannot learn, humans can. – Sean Airey May 01 '14 at 12:24
  • @Sean Frankly, I can't afford developers making mistakes. That's why I have unit tests. If I can't write a unit test which guarantees that my code will work in production then what is the point of unit testing at all? – Shayne May 01 '14 at 12:27
  • Well that's fair enough, I personally don't like the stance that you just replace people if they make a mistake. Several, after being warned and whatnot, then yes. I also understand that businesses have to look after their cash flow. I just don't think it's helpful to offer that as advice in any situation. I do happen to agree with you on the unit test aspect though, if you don't have 100% test coverage you might as well have not bothered in the first place. How about writing to a log if the incoming type is wrong and then checking that log for this specific entry and then failing the test? – Sean Airey May 01 '14 at 12:47
  • ... Instead of throwing an exception if you really don't want to throw an exception. – Sean Airey May 01 '14 at 12:48
  • You could use reflection in your unit test, to `Assert` that the types match... – Kris Vandermotten May 01 '14 at 13:11
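One way the type-check idea from the last comment might look is sketched below. It uses a plain `is` type test rather than reflection, and it runs against an in-memory `AsQueryable()` source instead of a RavenDB session. `QueryGuard` and `QueryGuardDemo` are hypothetical names, and the sketch assumes the provider's `Where` returns an object that implements `IQueryable<T>`, as standard LINQ providers do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryGuard
{
    // True only when the filtered sequence is still a query object,
    // i.e. it was built by Queryable.Where and not by Enumerable.Where.
    public static bool IsStillQueryable<T>(IEnumerable<T> filtered)
    {
        return filtered is IQueryable<T>;
    }
}

public static class QueryGuardDemo
{
    // Returns { result for Expression, result for Func }.
    public static bool[] Run()
    {
        IQueryable<int> source = new[] { 1, 2, 3, 4 }.AsQueryable();
        Expression<Func<int, bool>> asExpression = n => n > 2;
        Func<int, bool> asFunc = asExpression.Compile();

        return new[]
        {
            QueryGuard.IsStillQueryable(source.Where(asExpression)), // Queryable.Where
            QueryGuard.IsStillQueryable(source.Where(asFunc)),       // Enumerable.Where
        };
    }
}
```

A unit test could assert `QueryGuard.IsStillQueryable(...)` on whatever the production filter method returns. Because the check looks at the type of the result and not its contents, it does not depend on how many documents the test database holds.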

1 Answer


Well, it's a very bad idea to use a Func<T, bool> when working with an IQueryable, because this means you will retrieve all elements from the db (because your queryable will be enumerated), and only then will the filter be applied, on the resulting IEnumerable<T>.

If you work with an Expression<Func<T, bool>>, the Where clause will be applied on the db level, and only the filtered elements will be retrieved.

Edit: The behaviour is perfectly "correct", and absolutely predictable.

These two Where overloads are two different extension methods: Enumerable.Where and Queryable.Where.

The one with Func takes an IEnumerable<T> as its argument, and IQueryable<T> inherits from IEnumerable<T>. So if you pass a Func as the argument, that method will be chosen, and the queryable will be enumerated before filtering.
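The overload choice described above is made at compile time, and that can be observed without any database. The following is a minimal sketch using an in-memory queryable; `OverloadDemo` and `StaticTypeOf` are hypothetical helper names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OverloadDemo
{
    // Returns the compile-time type of its argument via generic type inference.
    public static Type StaticTypeOf<T>(T value) => typeof(T);

    // A Func variable can only bind to Enumerable.Where,
    // so the call's static type is IEnumerable<int>.
    public static Type WithFunc(IQueryable<int> q)
    {
        Func<int, bool> f = n => n > 2;
        return StaticTypeOf(q.Where(f));
    }

    // An Expression variable binds to Queryable.Where,
    // so the call's static type is IQueryable<int>.
    public static Type WithExpression(IQueryable<int> q)
    {
        Expression<Func<int, bool>> e = n => n > 2;
        return StaticTypeOf(q.Where(e));
    }
}
```

Note that a lambda written inline, as in `q.Where(n => n > 2)`, binds to Queryable.Where, since the lambda can convert to either parameter type and the IQueryable<T> overload is the more specific match. The silent switch to Enumerable.Where only happens when the argument is already typed as a delegate, for example after calling `.Compile()`.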

Raphaël Althaus
  • So why not throw an exception when passing a Func? This behaviour is unpredictable because it works for small sets of data, but not for large sets. – Shayne May 01 '14 at 12:00
  • I would say the behaviour is 100% predictable. When you use a `Func<T,bool>` parameter you are using the `Enumerable.Where` extension method. When you use `Expression<Func<T,bool>>` you are using the `Queryable.Where` extension method. It could be easy to use a wrong one, but that doesn't make it unpredictable. – Lukazoid May 01 '14 at 12:02
  • @Lukazoid I know that, but the issue is that I have the same predicate used for an in-memory filter as well as a database query filter. For the in-memory filter the collection is `IEnumerable<T>` so it uses a `Func<T,bool>` whereas for the database query the collection is an `IQueryable<T>` so it uses an `Expression<Func<T,bool>>`. So in my production code I have an overloaded "filter" method - one that takes an `IEnumerable<T>` and another that takes an `Expression<Func<T,bool>>`. Content is: `return list.Where(getPredicate(x).Compile())` and `return list.Where(getPredicate(x))` – Shayne May 01 '14 at 12:09