I have two columns, one is a string field customer
containing customer names and the other is a numeric field sales
representing sales.
What I want to do is to group data by customer and then sort sales within group.
In SQL or Pandas, this is normally achieved by something like order by customer, sales
on the table. But I am just curious about this implementation. Instead first sorting on customer
and then sorting on sales
, why not first group customer
and sort sales
. I don't really care about the order of the different customers since I only care about records of same customers being grouped together.
Grouping is essentially mapping and should run faster than sorting.
Why isn't there such implementation in SQL? Am I missing something?
Example data
name,sales
john,1
Amy,1
john,2
Amy,3
Amy,4
and I want it to group by name and then sort by sales:
name,sales
john,1
john,2
Amy,1
Amy,3
Amy,4
In SQL you probably would do select * from table order by name,sales
This would definitely do the job. But my confusion is since I don't care about the order of name, I should be able to do some kind of grouping (which should be cheaper than sorting) first and do sorting only on the numeric field. Am I able to do this? Why do a lot of examples from google simply uses sorting on the two fields? Thanks!