1

I am trying implement the dreaded GroupBy+First pattern in EF core 3.1.1 (with Pomelo adapter 3.1.1 on MariaDB), which already has an excellet answer using correlated subqueries. However when using this workaround, the EF refuses to translate DateTime.Date comparison in any form:

var query = from groupKey in _context.Message.Select(x => new { x.MessageType, x.CreatedAt.Date }).Distinct()
    from entity in _context.Message
        .Select(x => new { x.CreatedAt.Date, Value = x })
        .Where(x => x.Value.MessageType == groupKey.MessageType && x.Date == groupKey.Date)
        //   after removing Date comparison, the query executes ^^^^^^^^^^^^^^^^^^^^^^^^^^
        .OrderBy(x => x.Value.CreatedAt)
        .Take(1)
    select entity.Value;

the goal of query above is to select first message of each type in every day and then materialize the result. Client-side evaulation is unfeasible, because there are tens of thousands messages each day. If I remove the x.Date == groupKey.Date part, the query translates and is executed successfully. I have also tried comparing year/month/day separetly but the result was the same - EF throws exception.
How can I change the query so that it works as described?


the exception is pretty much the same as in the workaround's question, the query could not be translated:

System.InvalidOperationException: The LINQ expression '(CROSS APPLY (Projection Mapping: ( SELECT TOP((1)) ((CONVERT((m0.CreatedAt), (date))) AS c), ((m0.Id)), ((m0.Text)), ((m0.CreatedAt)) FROM (Message AS m0) WHERE ((((((CONVERT((m0.CreatedAt), (date))) == (t.c))) || (((Equal((CONVERT((m0.CreatedAt), (date))))) && (Equal((t.c))))))) && (((t.MessageType) == (m0.MessageType)))) ORDER BY ((m0.CreatedAt) ASC) ) AS t0))' could not be translated.

Which is a weird LINQ expression considering the expected translated SQL is quite trivial, the following query is tested and does exactly the same as the LINQ should:

SELECT *
from (select *, 
   ROW_NUMBER() over (partition by MessageType, YEAR(CreatedAt), MONTH(CreatedAt), DAY(CreatedAt) order by CreatedAt desc ) 
       as rn from Message
) as byDay where byDay.rn = 1;
wondra
  • 3,271
  • 3
  • 29
  • 48
  • What is the exception? – Fabio Oct 21 '20 at 08:56
  • @Fabio pretty much same as linked question, one sec i will copy mine – wondra Oct 21 '20 at 08:58
  • There's no `dreaded GroupBy+First pattern` - EF Core is an ORM, not a SQL replacement. ORMs are *not* well-suited for reporting queries like this one. The `dreaded` part is an attempt to replicate the pre-2008 SQL query you'd use when `ROW_NUMBER` wasn't available. The *correct* solution is to use eg `FromRawSQL` and work with those results. The execution plan generated when `ROW_NUMBER` is used is far better than using subqueries – Panagiotis Kanavos Oct 21 '20 at 09:19
  • 1
    BTW the partition clause should be `by messagetype, cast(CreatedAt as date)`. The query engine can convert operations using such casts to range searches that can still use indexes. – Panagiotis Kanavos Oct 21 '20 at 09:23
  • Done properly, your query could be as simple as `db.FromSqlRaw("select *, ROW_NUMBER() .... from Message").Where(msg=>msg.RN==1)`. Or you could use eg `Dapper` to execute the query and map the results without going through EF Core. – Panagiotis Kanavos Oct 21 '20 at 09:25
  • @PanagiotisKanavos thanks for pointing out how to make a better query. As why dreaded? Modern ORM are trying to create illusion they can do everything and easier without the error-prone raw queries, as you can see the API creates an impression that it is as powerful as SQL but give it a tiny innocent change and it explodes in wonderful fireworks. Of-course database expert could do *much* better than ORM, but you dont always have one at hand in which case ORM still comes handy. – wondra Oct 21 '20 at 09:30
  • @wondra modern ORMs don't do that. Developers use them this way to avoid learning SQL. Such queries are as bad and error prone as raw SQL if not worse. The very name says they shouldn't be used that way. ORM means Object to Relational Mapping. They are meant to Map *Objects* to Relational constructs, not generate queries – Panagiotis Kanavos Oct 21 '20 at 09:31
  • @wondra what you describe is embedded SQL, like [SQLJ](https://docs.oracle.com/cd/B28359_01/java.111/b31227/overview.htm#BABJACEE) which has nothing to do with objects (and hence ORMs). A preprocessor or language feature allows [embedding SQL statements in the code and mapping the results](https://docs.oracle.com/cd/B28359_01/java.111/b31227/overview.htm#BABJBDBB). This is completely different from ORMs as SQLJ has no knowledge of the objects apart from mapping the results – Panagiotis Kanavos Oct 21 '20 at 09:38
  • @wondra the closest to this in .NET is F#'s [SqlClient typed provider](https://fsprojects.github.io/FSharp.Data.SqlClient/) which retrieves and exposes type data during design time as well, so the compiler knows in advance if you try to use a non-existent field or an invalid cast – Panagiotis Kanavos Oct 21 '20 at 09:42
  • @PanagiotisKanavos still, I think its best to leave SQL in database - a resonable compromise, I think, is to add a view and query it from C#. There are of-course always posibilities of using different software/environments, but not everything is both production-ready and fits into the project's architecture. – wondra Oct 21 '20 at 10:28
  • What component versions are you using? With latest EFC 3.1.9 and Pomelo 3.2.3 this translates successfully (using `JOIN LATERAL` construct) – Ivan Stoev Oct 21 '20 at 10:44
  • @wondra, maybe this answer can help https://stackoverflow.com/questions/64412492/implementation-of-partition-by-command-by-linq – Svyatoslav Danyliv Oct 21 '20 at 11:40
  • @IvanStoev I was using 3.1.1 (both), but after updating to latest (3.1.9 + 3.2.3) the very same exception is thrown. Are you sure it was the same (type of) query - I have similar GroupBy+First working, but using simplier queries (just one integer-valued key). – wondra Oct 21 '20 at 12:09
  • I've just copied yours. With sample entity `public class Message { public int Id { get; set; } [Required] public string Text { get; set; } public int MessageType { get; set; } public DateTime CreatedAt { get; set; } }` – Ivan Stoev Oct 21 '20 at 12:48
  • 1
    @IvanStoev weird, apart from using database-first (=partial class) and running mariaDB server, I dont see any differences. I guess, I will have to delete the question if you (and probably others) cannot reproduce the issue from this sample. I cannot find any even remotely relevant code, except my config files and app startup - this was the simplest I could reproduce it on. Regardless, thank you for the help. – wondra Oct 21 '20 at 14:23
  • 2
    @wondra MariaDB seems to be the key. I was able to reproduce it by adding `mySqlOptions => mySqlOptions.ServerVersion(new Version(8, 0, 21), ServerType.MariaDb)` to `UseMySql`. If I omit it (as I did initially) or specify `ServerType.MySql`, it works. – Ivan Stoev Oct 21 '20 at 15:50

0 Answers0