Let's say I have a cars
table. It has following columns: id
, color
, engine
, price
, upholstery
, model_name
. So far I've been querying for all cars based on criteria passed from the front end app. However, now I'm told that if multiple cars have the same value of engine
, model_name
and price
they shouldn't be returned as they are duplicates. In this case, one car should be returned with a number of duplicates instead. So I probably need to do some sort of a group_by
on engine, model_name, price
but I still need to get values of the other (not-grouped) columns (id
, color
, upholstery
) for each record, as well as duplicates count.
My API needs to stay compliant with the old one. Currently it returns:
[
{id: 1, color: 'blue', engine: '2.0', price: 20000, upholstery: 'leather', model_name: 'outback'},
{id: 2, color: 'red', engine: '2.5', price: 20000, upholstery: 'material', model_name: 'wrx'},
{id: 3, color: 'yellow', engine: '2.5', price: 20000, upholstery: 'leather', model_name: 'wrx'},
]
Now it should be:
[
{id: 1, color: 'blue', engine: '2.0', price: 20000, upholstery: 'leather', model_name: 'outback', count: 1},
{id: 2, color: 'red', engine: '2.5', price: 20000, upholstery: 'material', model_name: 'wrx', count: 2},
]
Performance does matter in this case. In reality there's way more columns which are queried and joined from multiple tables.