
I have more than 10 million rows. Counting the number of rows is too slow. I want to estimate the number of rows.

Current query is like this:

SELECT count(*) FROM `database` WHERE `location` LIKE '%New York%'

This gives me the exact number, but it's too slow. What I want is something like this:

ESTIMATE count(*) FROM `database` WHERE `location` LIKE '%New York%'

Maybe this can be done by limiting the query to a subset of the table (every nth row or something). Maybe there is a built-in function that can be used.

Edit: Perhaps I can limit the query to every nth row, and then multiply the count by n.

Edit2: I just tried counting every 10th row and then multiplying the count by ten, but there was no increase in speed, even though the auto-incrementing `counter` column is the primary key.

SELECT count(*) FROM database WHERE `location` LIKE '%Los Angeles%' and counter%10 = 0
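One likely reason Edit2 saw no speedup: a modulo condition on `counter` cannot be satisfied from the primary-key index as a range, so MySQL still visits every row. A contiguous range can use the index; the sketch below counts within roughly the first 1% of `counter` values and extrapolates. The 1% sample size is illustrative, and the estimate assumes `counter` values are dense (few gaps) and that the sampled range is representative of the whole table:

```sql
-- counter % 10 = 0 forces a scan of all rows; a contiguous range does not.
-- Count matches in ~1% of the table and scale up (sketch; assumes dense,
-- representative counter values):
SELECT count(*) * 100 AS estimated_total
FROM `database`
WHERE counter <= (SELECT MAX(counter) FROM `database`) / 100
  AND `location` LIKE '%New York%';
```

Note the `LIKE '%New York%'` test still has to run on every row in the sampled range; the saving comes only from reading 1% of the rows.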
Ned Hulton
  • There isn't such a thing. You can take a look at https://stackoverflow.com/questions/4400451/optimize-mysql-count-query, but if you have an auto_increment column, why not use its max as the estimate? – nbk Mar 27 '22 at 22:50
  • Right, I can simply add "WHERE COUNTER > 100 and COUNTER < 200" to get a sample of 100 rows. The problem with that is that I would be focusing on a particular part of the table which may not be representative. Is there a way to get every nth row? – Ned Hulton Mar 27 '22 at 22:59
  • If that is what you really need on a regular basis, maintain a counter with AFTER triggers. It costs some time on writes, but you get a quick answer. Counts are always bad, and you must consider whether you really need them – nbk Mar 27 '22 at 23:05

2 Answers


The performance of COUNT(*) is not the biggest problem in this case. The query cannot be made faster, because your search condition LIKE '%...%' forces a table scan: every row in the table must be examined anyway, to check whether it matches the condition.

Alternatives to make it quicker:

  • Match the pattern using fulltext indexing instead of LIKE predicates with wildcards

  • Store the count and keep it updated as you add or change data, as the comment above suggests (but you'd need a separate counter for each possible search pattern)

  • Get a more powerful computer

  • Store less data
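The second bullet can be sketched with a summary table kept current by triggers, as the comment above suggests. The table, column, and trigger names below are hypothetical, and this only works for a fixed, known search term; similar AFTER UPDATE and AFTER DELETE triggers would also be needed to keep the count accurate:

```sql
-- Hypothetical summary table: one pre-computed count per search term.
CREATE TABLE location_counts (
  term VARCHAR(100) PRIMARY KEY,
  cnt  BIGINT NOT NULL DEFAULT 0
);

-- Bump the 'New York' count as matching rows are inserted (sketch).
DELIMITER //
CREATE TRIGGER bump_ny_count AFTER INSERT ON `database`
FOR EACH ROW
BEGIN
  IF NEW.location LIKE '%New York%' THEN
    UPDATE location_counts SET cnt = cnt + 1 WHERE term = 'New York';
  END IF;
END//
DELIMITER ;

-- Reading the count is then an indexed point lookup:
SELECT cnt FROM location_counts WHERE term = 'New York';
```

The trade-off is exactly the one noted in the comments: each write gets slightly slower, but the count query becomes a single-row lookup.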

Bill Karwin
  • In some RDBMS products, creating an index lets the system read the number of records straight from the index structure; would MySQL have something like that (of course, after removing LIKE '%..') – tinazmu Mar 28 '22 at 00:47
  • 1
    Yes, values in an index are persistent, but MySQL does not have indexes that can store aggregated calculations like `COUNT(*)`. – Bill Karwin Mar 28 '22 at 01:52
  • 1
    MySQL's outdated MyISAM storage engine keeps a record of the total number of rows in the table, which makes `SELECT COUNT(*)` fast, _provided your query have no condition in the WHERE clause._ But InnoDB (the current default storage engine) does not support this, because the count of rows in a table varies due to transactions. – Bill Karwin Mar 28 '22 at 01:54
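For an unfiltered total, InnoDB does expose an approximate row count without a scan, via the statistics it maintains. This is only a whole-table estimate (it ignores any WHERE condition) and can deviate noticeably from the true count:

```sql
-- Approximate row count from InnoDB table statistics; no scan,
-- no WHERE filtering, and the value is only an estimate.
SELECT TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'database';
```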

This would run immensely faster if you have INDEX(city):

WHERE `city` = 'New York'

This would run nearly as fast as that:

WHERE `city` LIKE 'New York%'

If location is currently a combination of address, city, state, and zip, you may need to rethink the choice of storing it all in a single column. Or go with a FULLTEXT index, as Bill Karwin suggests.
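Both options can be sketched as follows; the column and index names are illustrative, and the FULLTEXT variant assumes the search term meets MySQL's minimum indexed word length:

```sql
-- Option 1: a separate, indexed city column. Equality and prefix
-- searches can then use the index instead of scanning the table.
ALTER TABLE `database`
  ADD COLUMN city VARCHAR(100),
  ADD INDEX idx_city (city);

SELECT count(*) FROM `database` WHERE city = 'New York';

-- Option 2: a FULLTEXT index on the existing location text.
ALTER TABLE `database` ADD FULLTEXT INDEX ft_location (`location`);

SELECT count(*) FROM `database`
WHERE MATCH(`location`) AGAINST('"New York"' IN BOOLEAN MODE);
```

MATCH ... AGAINST is not a drop-in replacement for `LIKE '%...%'` (it matches whole words, not arbitrary substrings), but for place names it is usually what is wanted, and it avoids the full scan.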

Rick James