0

I'm trying to retrieve top 5 marks from school database in databricks deltatable using SQL query. So I wrote following query

select   
    rs.State, rs.Year, rs.CW, rs.Country, rs.SchoolName,
    rs.EducationSystem, rs.MarksS1, rs.MarksS2, rs.MarksS3, rs.MarksS4,
    rs.TotalMarks, rs.group_rank
from
    (select 
         State, Year, Country, SchoolName, EducationSystem,
         MarksS1, MarksS2, MarksS3, MarksS4, TotalMarks,
         row_number() over (partition by State, Year, Country, SchoolName, EducationSystem  
                            order by TotalMarks DESC, MarksS1 ASC) as group_rank
     from 
         nm_combined_historical_data_distinct) rs 
where 
    group_rank < 6

I'm trying to get top 5 students based on marks. But if top 5 students have same marks, I want my output like this

State  year  Country School Name  Education System MarksS1 MarksS2 MarksS3 MarksS4 Total
 AZ     2020   US       XYZ           ABC             95      91      92      95      373
 AZ     2020   US       XYZ           ABC             95      91      92      95      373
 AZ     2020   US       XYZ           ABC             95      91      92      95      373
 AZ     2020   US       XYZ           ABC             95      91      92      95      373
 AZ     2020   US       XYZ           ABC             95      91      92      95      373

But my output is coming in this way

 State  year  Country School Name  Education System MarksS1 MarksS2 MarksS3 MarksS4 Total
  AZ     2020   US       XYZ           ABC             95      91      92      95      373

Can you suggest me how do I get my desired output

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
newinPython
  • 313
  • 1
  • 6
  • 19

1 Answers1

0

Instead of row_number() use rank():

rank() over (partition by State, Year, Country, SchoolName, EducationSystem
             order by TotalMarks DESC, MarksS1 ASC
            ) as group_rank
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786