0

What would be a performant approach for iterating a subset of records in a table with EF 6 in C#?

For .Take() and .Skip() I need to order the result first which I don't think is an option on large subsets.

Basically what I'm trying to do is iterate through every record from a .where query with only loading each record one after another.

A fast way of:

using (Context context = new Context())
{
    var parent = context.MetadataImportSets.Single(x => x.Id == 1);
    var subset = context.Child.Where(x => x.Parent.Id == parent.Id);

    foreach (var record in subset)
    {

    }
}
EpicKip
  • 4,015
  • 1
  • 20
  • 37
F.H.
  • 1,456
  • 1
  • 20
  • 34
  • 2
    *I need to order the result first which i don't think is an option on large subsets* why? Just use an `order by` clause in your SQL. SQL should order the result set pretty fast. If it's really big you may need to add some indexes to keep it fast or better yet, order by the primary key. – Liam May 09 '17 at 08:42
  • Possible duplicate of [Paging in Entity Framework](http://stackoverflow.com/questions/1049825/paging-in-entity-framework) – Liam May 09 '17 at 08:44
  • *Why* are you trying to load children separately from the parent? The `parent` class should have a `Children` property. Let EF do its job and load the children, don't try to duplicate its job, or make it work as if it were a replacement for SQL – Panagiotis Kanavos May 09 '17 at 08:45
  • @PanagiotisKanavos Using include to load it into memory is not an option because it could have > 1'000'000 child-entities. – F.H. May 09 '17 at 08:55
  • @Liam even for large tables holding more than 1'000'000 records? – F.H. May 09 '17 at 08:57
  • 2
    @F.H. then use the appropriate query, or just *don't* use `Include()`. `Include` doesn't "load it in memory", it means that the children will be eagerly loaded. You can load what you want lazily, or you can write an appropriate query to return what you want – Panagiotis Kanavos May 09 '17 at 09:04
  • 1
    @F.H. on the other hand, if you have a lot of records, why are you using an ORM at all? ORMs are meant to work with object graphs, not for batch processing or reporting. What is the *actual* problem you want to solve? Why not use SQL statements, or an ETL tool, or a report? – Panagiotis Kanavos May 09 '17 at 09:06
  • 1,000,000 records isn't that big. With the correct optimisations and a decent SQL box I'd expect the **ordering** to take milliseconds (or less). Iterating those using C# on the other hand is much more expensive. – Liam May 09 '17 at 09:20
  • @PanagiotisKanavos "then use the appropriate query, or just don't use Include()" exactly my question. Let me clarify the question: what is the fastest way to iterate large sets of database records **with EF** – F.H. May 09 '17 at 10:54

0 Answers0