I need to group by a uniqueidentifier column in a table that also contains an XML column.

Table schema: StudentMark:

CREATE TABLE [dbo].[StudentMark]
(
    [StudentMarkId] [int] IDENTITY(1,1) NOT NULL,
    [StudentId] [uniqueidentifier] NULL,
    [SubjectId] [uniqueidentifier] NULL,
    [ScoreInfo] [xml] NULL,
    [GeneratedOn] [datetime2](2) NOT NULL,

    CONSTRAINT [PK_StudentMark] 
       PRIMARY KEY CLUSTERED ([StudentMarkId] ASC)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

Sample seed data

INSERT INTO [dbo].[StudentMark] ([StudentId], [SubjectId], [ScoreInfo], [GeneratedOn])
VALUES ('FC3CB475-B480-4129-9190-6DE880E2D581', '0D72F79E-FB48-4D3E-9906-B78A9D105081', '<StudentMarkAttribute xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"></StudentMarkAttribute>', '2017-08-10 10:20:15'),
       ('0F4EF48C-93E3-41AA-8295-F6B0E8D8C3A2', '0D72F79E-FB48-4D3E-9906-B78A9D105081', '<StudentMarkAttribute xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"></StudentMarkAttribute>', '2017-08-10 10:20:15'),
       ('0F4EF48C-93E3-41AA-8295-F6B0E8D8C3A2', 'AB172272-D2E9-49E1-8040-6117BB6743DB', '<StudentMarkAttribute xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"></StudentMarkAttribute>', '2017-08-16 09:06:20'),
       ('FC3CB475-B480-4129-9190-6DE880E2D581', 'AB172272-D2E9-49E1-8040-6117BB6743DB', '<StudentMarkAttribute xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"></StudentMarkAttribute>', '2017-08-16 09:06:20');

Requirement: I need to group by [dbo].[StudentMark].[StudentId] and take the latest record.

I tried the following SQL query, but it raises an error:

SELECT 
    MAX([StudentMarkId]), [StudentId], [SubjectId], [ScoreInfo], [GeneratedOn]
FROM 
    [dbo].[StudentMark] 
GROUP BY 
    [StudentId]

Error:

Column 'dbo.StudentMark.SubjectId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

I referred to the following question, but I still can't fix it: "Reason for Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause"

Kindly assist me.

B.Balamanigandan

3 Answers

Use ROW_NUMBER to calculate position within group:

SELECT *
FROM (
    SELECT *,
      ROW_NUMBER() OVER(PARTITION BY StudentId ORDER BY StudentMarkId DESC) AS rn
    FROM [dbo].[StudentMark]) sub
WHERE sub.rn = 1;
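
If "latest" is meant in terms of GeneratedOn rather than the identity value, the same pattern can order by that column instead. The following is only a sketch under that assumption: it lists the columns explicitly so the helper column rn is not returned, and it uses StudentMarkId as a tie-breaker when two rows of a student share the same timestamp.

```sql
-- Sketch: latest row per student by GeneratedOn (assumed meaning of "latest").
SELECT sub.StudentMarkId, sub.StudentId, sub.SubjectId, sub.ScoreInfo, sub.GeneratedOn
FROM (
    SELECT sm.StudentMarkId, sm.StudentId, sm.SubjectId, sm.ScoreInfo, sm.GeneratedOn,
           -- Number each student's rows from newest to oldest;
           -- StudentMarkId breaks ties between equal timestamps.
           ROW_NUMBER() OVER (PARTITION BY sm.StudentId
                              ORDER BY sm.GeneratedOn DESC, sm.StudentMarkId DESC) AS rn
    FROM dbo.StudentMark AS sm
) AS sub
WHERE sub.rn = 1;  -- keep only the newest row of each student
```

With the sample seed data from the question, this should return the two rows dated 2017-08-16, one per student. Because the XML column is never grouped or compared, no conversion is needed.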
Lukasz Szozda

An alternative solution works best if you have a Students table:

select sm.*
from students s cross apply
     (select top 1 sm.*
      from studentmark sm
      where sm.studentid = s.studentid
      order by sm.generatedon desc
     ) sm;
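
The question does not show a Students table, so here is a hedged variant of the same idea that derives the list of students from StudentMark itself. The derived table s and the alias latest are illustrative names, not part of the original schema.

```sql
-- Sketch: CROSS APPLY without a separate Students table.
SELECT latest.StudentMarkId, latest.StudentId, latest.SubjectId,
       latest.ScoreInfo, latest.GeneratedOn
FROM (SELECT DISTINCT StudentId FROM dbo.StudentMark) AS s  -- one row per student
CROSS APPLY (
    -- For each student, fetch only the newest mark row.
    SELECT TOP (1) sm.StudentMarkId, sm.StudentId, sm.SubjectId,
                   sm.ScoreInfo, sm.GeneratedOn
    FROM dbo.StudentMark AS sm
    WHERE sm.StudentId = s.StudentId
    ORDER BY sm.GeneratedOn DESC, sm.StudentMarkId DESC
) AS latest;
```

Rows whose StudentId is NULL are not returned by this variant, because the equality predicate never matches NULL. The extra DISTINCT scan also removes part of the advantage of having a small Students table to drive the lookup.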
Gordon Linoff
  • Could you please confirm which one is preferable, your approach or the `ROW_NUMBER` approach? Which one is more efficient? – B.Balamanigandan Aug 17 '17 at 10:10
  • @B.Balamanigandan . . . You should try on your data to see which performs better. With an index on `studentmark(studentid, generatedon)`, I would expect this to be slightly faster. – Gordon Linoff Aug 17 '17 at 11:19
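
The index mentioned in the comment above could be created roughly as follows. The index name is a made-up example, and the descending order on GeneratedOn is an assumption chosen to match the ORDER BY of the lookup; whether it actually helps should be checked against the real data and execution plans.

```sql
-- Sketch: supporting index for "newest row per student" lookups.
-- IX_StudentMark_StudentId_GeneratedOn is a hypothetical name.
CREATE NONCLUSTERED INDEX IX_StudentMark_StudentId_GeneratedOn
    ON dbo.StudentMark (StudentId, GeneratedOn DESC);
```

With this index in place, the TOP (1) lookup for each student can seek to that student's newest entry instead of scanning and sorting all of the student's rows.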
You cannot group by XML or TEXT columns. You would first need to convert the column to varchar(max):

SELECT 
    MAX([StudentMarkId]), [StudentId], [SubjectId],
    CONVERT(XML, CONVERT(VARCHAR(MAX), [ScoreInfo])) DetailXML,
    [GeneratedOn]
FROM 
    [dbo].[StudentMark] 
GROUP BY 
    [StudentId], [SubjectId], 
    CONVERT(VARCHAR(MAX), [ScoreInfo]), [GeneratedOn]

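In the SELECT list, ScoreInfo is first converted to varchar(max) so that the expression matches the one in the GROUP BY clause, and the result is then cast back to XML. Note that grouping by every column makes the error go away, but it returns one row per distinct combination of values rather than only the latest row per student.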
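
If GROUP BY is preferred, one way to avoid grouping on the XML column at all is to aggregate only the key columns and join the result back to the table. This is a sketch that assumes, as the query in the question does, that the highest StudentMarkId identifies a student's latest record; the alias latest and the column name LatestStudentMarkId are illustrative.

```sql
-- Sketch: GROUP BY on StudentId only, then join back to fetch the full row.
SELECT sm.StudentMarkId, sm.StudentId, sm.SubjectId, sm.ScoreInfo, sm.GeneratedOn
FROM dbo.StudentMark AS sm
INNER JOIN (
    -- One row per student carrying the id of that student's newest record.
    SELECT StudentId, MAX(StudentMarkId) AS LatestStudentMarkId
    FROM dbo.StudentMark
    GROUP BY StudentId
) AS latest
    ON latest.LatestStudentMarkId = sm.StudentMarkId;
```

Only StudentId appears in the GROUP BY, so the rule behind the original error is satisfied, and ScoreInfo is read from the joined row without any conversion.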

marc_s
Alfaiz Ahmed