Sorry. This is a rewritten question because there was no answer.
I changed it from rank() to row_number() because the same rank is not needed.
I'm currently working on a query, but I'm not getting the speed so I'm asking a question.
What I want to find is 'count the most names in each category by DataID'.
Data_Category { id : INT PRIMARY KEY AUTO_INCREMENT, DataId: INT, name: varchar2(200), category: varchar2(200) }
and it is possible to input in duplicate.
select dataId, name, category, count(*) as cnt
, row_number() over (partition by dataId, category order by count(*) desc) as "ranking"
from data_category
group by dataId, name, category
In the above query, there is a difference in speed with and without the rank part.
If there is no row_number, it takes 0.0053, and if there is, it takes 0.5 seconds. The result is about 260,000 rows.
select *, row_number() over (partition by A.dataId, A.category order by cnt desc) as "ranking"
from(
select dataId, name, category, count(*) as cnt
from data_category
group by dataId, name, category
) A
The above query also yields almost the same speed. The moment it is wrapped in a select statement, it is slowing down due to the temporary table creation.
The data is about 400,000 cases.
The index is (dataId, category, name) .
I want to extract 5 names with the highest number of names in each category by data ID, but I can't figure out the direction of the query.
And I will create 'view' using this query. So there are no additional where clause. After that, I want to use it by joining it with another table.
Even if you don't write a query, I sincerely ask for a simple hint..!
Output:
dataId | category | name | rank |
---|---|---|---|
1 | cate1 | name1_1 | 1 |
1 | cate1 | name1_2 | 2 |
1 | cate1 | name1_5 | 3 |
1 | cate1 | name1_3 | 4 |
1 | cate2 | name2_1 | 1 |
1 | cate2 | name2_5 | 2 |
1 | cate2 | name2_3 | 3 |
2 | cate1 | name3_1 | 1 |
2 | cate3 | name3_9 | 1 |
Thank you.
Explain
#Query 1-2 : Without rank_row
id | select_type | table | type | key | rows | filtered | extra |
---|---|---|---|---|---|---|---|
1 | SIMPLE | data_cateogory | index | IDX_39ea6497ea9ca40d9e783e | 432344 | 100.00 | Using index; |
#Query 1-1 : With rank_row
id | select_type | table | type | key | rows | filtered | extra |
---|---|---|---|---|---|---|---|
1 | SIMPLE | data_cateogory | index | IDX_39ea6497ea9ca40d9e783e | 432344 | 100.00 | Using index; Using temporary; Using filesort |