2

I've realised that I don't fully understand the Include method in LINQ-to-Entities.

For example, take the two code snippets below. I would expect them to produce the same output (though the first version may be more efficient because it avoids JOINs).

// Snippet 1
using (var db = new Db()) {
  var author = db.Authors.First();
  db.LoadProperty<Author>(author, o => o.Books);
  foreach (var book in author.Books) {
    db.LoadProperty<Book>(book, o => o.Editions);
    foreach (var edition in book.Editions)
      Response.Write(edition.Id + " - " + edition.Title + "<br />");
  }
}

Response.Write("<br />");

// Snippet 2
using (var db = new Db()) {
  var author = db.Authors.Include("Books.Editions").First();
  foreach (var book in author.Books) {
    foreach (var edition in book.Editions)
      Response.Write(edition.Id + " - " + edition.Title + "<br />");
  }
}

But the output for each snippet is different:

1 - Some Book First Edition
2 - Another Book First Edition
3 - Another Book Second Edition
4 - Another Book Third Edition

8 - Some Book First Edition
9 - Another Book First Edition

The first snippet correctly outputs {Edition Id} - {Edition Title}, whereas the second one unexpectedly prints {Book Id} - {Edition Title} and only gives the first edition of each book.

What's going on? And is there a way to achieve the desired output using Include?

EDIT 1: The MySql data looks like (corrected):

Authors         = { { Id = 1, Name = "Some Author" } }

Books           = { { Id = 8, AuthorId = 1 },
                    { Id = 9, AuthorId = 1 } }

Editions        = { { Id = 1, Title = "Some Book First Edition" },
                    { Id = 2, Title = "Another Book First Edition" },
                    { Id = 3, Title = "Another Book Second Edition" },
                    { Id = 4, Title = "Another Book Third Edition" } }

EditionsInBooks = { { BookId = 8, EditionId = 1 },
                    { BookId = 9, EditionId = 2 },
                    { BookId = 9, EditionId = 3 },
                    { BookId = 9, EditionId = 4 } }

Note that there is no Edition with Id = 8 or Id = 9.

And the code above is my complete code, in Page_Load for an empty test page.

EDIT 2: I've tested the following and they don't make a difference:

  1. var author = db.Authors.Include("Books.Editions").AsEnumerable().First();
  2. var author = db.Authors.Include("Books.Editions").Single(o => o.Id == 1);
  3. var author = db.Authors.Include("Books").Include("Books.Editions").First();

EDIT 3: If I enable Lazy Loading, the following works (in Snippet 2):

var author = db.Authors.First();

(This is essentially doing the same as Snippet 1, I suppose.)

However, this still returns the strange output irrespective of Lazy Loading:

var author = db.Authors.Include("Books.Editions").First();

EDIT 4: I'm really sorry, but I misrepresented the table structure above. (I'm having one of those days.) It's now corrected, to show the many-to-many relationship. Please see Edit 1.

Also the output for

((ObjectQuery)db.Authors.Include("Books.Editions").AsEnumerable())
  .ToTraceString()

is

SELECT
  `Project1`.`Id`,
  `Project1`.`Name`,
  `Project1`.`C2` AS `C1`,
  `Project1`.`id1`,
  `Project1`.`AuthorId`,
  `Project1`.`C1` AS `C2`,
  `Project1`.`id2`,
  `Project1`.`Title`
FROM (SELECT
  `Extent1`.`Id`,
  `Extent1`.`Name`,
  `Join2`.`Id` AS `id1`,
  `Join2`.`AuthorId`,
  `Join2`.`Id` AS `id2`,
  `Join2`.`Title`,
   CASE WHEN (`Join2`.`Id` IS NULL) THEN (NULL)
        WHEN (`Join2`.`BookId` IS NULL) THEN (NULL)
        ELSE (1) END AS `C1`,
   CASE WHEN (`Join2`.`Id` IS NULL) THEN (NULL)
        ELSE (1) END AS `C2`
   FROM `authors` AS `Extent1`
     LEFT OUTER JOIN (SELECT
       `Extent2`.`Id`,
       `Extent2`.`AuthorId`,
       `Join1`.`BookId`,
       `Join1`.`EditionId`,
       `Join1`.`Id` AS `Id1`,
       `Join1`.`Title`
       FROM `books` AS `Extent2`
       LEFT OUTER JOIN (SELECT
         `Extent3`.`BookId`,
         `Extent3`.`EditionId`,
         `Extent4`.`Id`,
         `Extent4`.`Title`
         FROM `editionsinbooks` AS `Extent3`
         INNER JOIN `editions` AS `Extent4`
           ON `Extent4`.`Id` = `Extent3`.`EditionId`) AS `Join1`
       ON `Extent2`.`Id` = `Join1`.`BookId`) AS `Join2`
     ON `Extent1`.`Id` = `Join2`.`AuthorId`) AS `Project1`
   ORDER BY
     `Project1`.`Id` ASC,
     `Project1`.`C2` ASC,
     `Project1`.`id1` ASC,
     `Project1`.`C1` ASC

The CASE statements are interesting, given that none of my MySql fields are nullable.

CassOnMars
  • 6,153
  • 2
  • 32
  • 47
James
  • 7,343
  • 9
  • 46
  • 82
  • 2
    Could you show the entire content of the authors, books, and editions tables? I'm guessing you have more data in there than meets the eye, potentially even duplicate info, and the include statement is reordering it, making the `First()` return a different author. – CassOnMars Oct 18 '11 at 17:11
  • I have to concur with @d_r_w here. Can you try outputting the Author as well as the books? – John Gietzen Oct 18 '11 at 17:13
  • @d_r_w @John See my edit. Also, if I output `edition.Id + " - " + book.Id + " - " + author.Id` it gets the book ID and author ID correct, but the edition ID is output as the same as the book ID. – James Oct 18 '11 at 17:26
  • Have you tried adding .Include("Books") as well as the current include? – meandmycode Oct 18 '11 at 17:46
  • Good idea, just tried it but no luck. See my edit. – James Oct 18 '11 at 17:53
  • Just because your columns aren't nullable, doesn't mean that a result column won't contains nulls. For example, a left join can leave the entire right table as null if there are no matching rows. – John Gietzen Oct 18 '11 at 19:50
  • What provider are you using to get MySQL to work with EF? It might be a bug in your provider. It shouldn't be generating a query like that. – CassOnMars Oct 18 '11 at 20:02
  • @d_r_w: `provider=MySql.Data.MySqlClient` (version 6.4.3.0) – James Oct 18 '11 at 20:08
  • 1
    That was what I needed. It appears that the MySQL provider for Entity Framework is rife with bugs, especially when concerned with joins. I'm not very familiar with MySQL and EF, or any of its shortcomings, but it appears if you want to guarantee that your properties are being navigated properly and you're using MySQL, you're gonna have to do it the first snippet's way. – CassOnMars Oct 18 '11 at 20:13

1 Answers1

1

There might be a bug in Entity Framework's provider's compiling of the First() expression in the LINQ statement. There's occasional weirdness when Include is involved: http://wildermuth.com/2008/12/28/Caution_when_Eager_Loading_in_the_Entity_Framework

Try rewriting the second snippet to be:

using (var db = new Db()) {
    var author = db.Authors.Include("Books.Editions").AsEnumerable().First();
    foreach (var book in author.Books)
    {
        foreach (var edition in book.Editions)
        {
            Response.Write(edition.Id + " - " + edition.Title + "<br />");
        }
    }
}

If that fixes your output, then it's definitely the First() method.

EDIT: You are correct about your third edit doing the same thing as snippet 1. I can't fathom how that include statement is tripping things up so badly. The only thing I could encourage is to look at the SQL query it's generating:

var sql = ((System.Data.Objects.ObjectQuery)db.Authors.Include("Books.Editions").AsEnumerable().First()).ToTraceString();

EDIT 2: It may very well be possible that the problem is in your MySQL provider for EF, given the crazy sql output generated.

CassOnMars
  • 6,153
  • 2
  • 32
  • 47
  • Thanks for your suggestion, but I'm afraid the output is the same. It's also the same if I replace `First()` with `Single(o => o.Id == 1)`. – James Oct 18 '11 at 17:40
  • @James could you look at my edit and tell us what the output of that statement is? – CassOnMars Oct 18 '11 at 18:42
  • See my edit. I also specified my table structure wrong, I forgot there was a many-to-many relationship. It's now corrected. Sorry. :/ – James Oct 18 '11 at 19:33