How do I write a LINQ query that would translate to the following SQL?

SELECT u.id, u.username, a.id, MIN(a.created_at) AS firstArticle
FROM users u
INNER JOIN editorial_articles a
ON a.user_id = u.id
GROUP BY u.id
ORDER BY u.id, a.created_at, a.id

Basically, a list of users with their first article.

Everything that I try results in an incorrect group-by clause (too many columns), not enough columns selected, or some other issue.

I've tried a thousand different combinations – why is this so difficult?

Classes:

[Table("users")]
public class User
{
    [Column("id", IsPrimaryKey = true, IsIdentity = true, SkipOnInsert = true, SkipOnUpdate = true)]
    public int Id { get; set; } // int

    [Column("username", CanBeNull = false)]
    public string Username { get; set; } = null!; // varchar(20)

    [Association(ThisKey = nameof(Id), OtherKey = nameof(Article.UserId))]
    public IEnumerable<Article> Articles { get; set; } = null!;
}

[Table("articles")]
public class Article
{
    [Column("id", IsPrimaryKey = true, IsIdentity = true, SkipOnInsert = true, SkipOnUpdate = true)]
    public int Id { get; set; } // int
    
    [Column("user_id")]
    public int UserId { get; set; } // int
    
    [Column("created_at")]
    public DateTime CreatedAt { get; set; } // datetime

    [Association(CanBeNull = false, ThisKey = nameof(UserId), OtherKey = nameof(User.Id))]
    public User User { get; set; } = null!;
}

Update:

The closest that I get with a GROUP BY is:

var result =
    from user in db.Users
    join article in db.Articles
    on user.Id equals article.UserId into articles
    from article in articles
    group article by new { user } into g
    orderby g.Key.user.Id, g.Min(a => a.CreatedAt), g.Min(a => a.Id)
    select new
    {
        User = g.Key,
        FirstArticle = g.Min(a => a.CreatedAt)
    };

I don't like that it puts all of the user fields into the group-by. If you just group by Id, then it's not possible to reference the initial user in the select.

Is there no way to group by ID, but select additional fields?

glen-84
  • Can you add your class structures to your question and show an example of what you have tried? – stuartd May 29 '23 at 13:12
  • I think I understand why this happens now – if you only group by `user.Id`, but select `user.Username` as well, and there are two or more users with the same ID but different usernames (not the case, but LINQ doesn't know), then it can't know which of the usernames to select, and the key is therefore not unique. It can work at the SQL level though, since there is only one username per user ID. – glen-84 May 29 '23 at 15:43
  • This query relies on a MySQL quirk: a field that is not part of the grouping key can still be retrieved from the result. It is not ANSI standard, and you cannot generate such a query via `linq2db`. – Svyatoslav Danyliv May 30 '23 at 15:24
  • @SvyatoslavDanyliv Yep, I finally understand that now. Apparently with PostgreSQL, you only need to include the PK (https://stackoverflow.com/questions/9657695/getting-wrong-values-for-other-columns-when-i-select-maxupdated-date#comment12264846_9657773). – glen-84 May 31 '23 at 08:48
  • I cannot understand how the PK can help here. Postgres supports LATERAL JOIN (CROSS APPLY), and this query can be rewritten without a window function. Other solutions are workarounds. – Svyatoslav Danyliv May 31 '23 at 08:55
  • It's just interesting that PostgreSQL doesn't require you to list all of the columns, I'm not suggesting that GROUP BY is a better solution. – glen-84 May 31 '23 at 10:46
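
For reference, the LATERAL JOIN rewrite mentioned in the comments might look roughly like the following in PostgreSQL. This is a hand-written sketch, not output generated by `linq2db`. It assumes the table is named `articles` (as in the `[Table("articles")]` mapping rather than the `editorial_articles` name used in the SQL above), and the alias `fa` is made up for illustration.

```sql
-- Sketch only: for each user, pick the earliest article via a lateral subquery.
SELECT u.id, u.username, fa.id AS article_id, fa.created_at AS first_article
FROM users u
CROSS JOIN LATERAL (
    SELECT a.id, a.created_at
    FROM articles a
    WHERE a.user_id = u.id
    ORDER BY a.created_at, a.id   -- a.id breaks ties between equal timestamps
    LIMIT 1
) fa
ORDER BY u.id;
```

Users without any articles are dropped by the lateral subquery returning no rows, which matches the INNER JOIN in the original query.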

1 Answer

Use the ROW_NUMBER window function for this kind of task:

var query =
    from u in db.Users
    from a in u.Articles
    select new
    {
        User = u,
        FirstArticle = a,
        // Number each user's articles, oldest first.
        RN = Sql.Ext.RowNumber().Over()
            .PartitionBy(u.Id)
            .OrderBy(a.CreatedAt)
            .ToValue()
    } into s
    // Keep only the first (oldest) article per user.
    where s.RN == 1
    select new
    {
        s.User,
        s.FirstArticle
    };
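
The SQL this corresponds to should have roughly the following shape. This is a hand-written sketch to show the idea, not the exact text `linq2db` emits; the aliases `s`, `rn` and `article_id` are illustrative, and the table name `articles` is taken from the `[Table("articles")]` mapping.

```sql
-- Sketch only: rank each user's articles by creation date, then keep rank 1.
SELECT s.id, s.username, s.article_id, s.created_at
FROM (
    SELECT
        u.id,
        u.username,
        a.id AS article_id,
        a.created_at,
        ROW_NUMBER() OVER (PARTITION BY u.id ORDER BY a.created_at) AS rn
    FROM users u
    INNER JOIN articles a ON a.user_id = u.id
) s
WHERE s.rn = 1;
```

Note that no GROUP BY is involved, so every user and article column can be selected freely. If two articles of the same user share a `created_at` value, which one gets row number 1 is not deterministic unless a tie-breaker such as the article id is added to the ordering.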
Svyatoslav Danyliv