0

I have a table with large amount of data(Say 1,000,000,000 rows).

Table Structure:

Id(Primary Key)
Name
...

I have omitted other fields from the table as I cannot use any of those to limit the data I fetch.

Here primary key is Id. I do not have any index on, only the name column.

I need to find a list of Names which occur more than n number of times(Say n=10).

I have tried the below options:

SELECT /+full(T)/ Name,COUNT(Id) AS CNT FROM T GROUP BY Name HAVING COUNT(ID) >10;

select distinct Name, COUNT(ID) OVER (PARTITION BY Name) AS CNT FROM T where CNT>10;

Both of them are taking large temp space.

Any other suggestions to optimize the query to use less temp space. Speed is not the primary concern here for me. It is fine even if the query takes around 2 hours.

kayess
  • 3,384
  • 9
  • 28
  • 45

1 Answers1

0

WITH names AS ( select Name, COUNT(ID) OVER (PARTITION BY Name) AS CNT FROM T) select DISTINCT name from names where names.CNT > 10

ArtBajji
  • 949
  • 6
  • 14
  • This is similar to the second query I have posted. It is not reducing the amount of temp space required. – Minnu perinchery Apr 26 '16 at 09:34
  • WHERE clause executes before analytic functions. So the second query in the question will not work. Hence I have added it into a WITH clause. If that is also slower, can you try this? WITH names AS ( select Name, COUNT(ID) OVER (PARTITION BY Name) AS CNT FROM T ORDER BY 2 DESC) select name from names where names.CNT > 10 – ArtBajji Apr 26 '16 at 10:00