1

I have a page that loads some high-level statistics. Nothing fancy, just about 5 metrics. There are two particular queries that takes about 5s each to load:

+ SELECT COUNT(*) FROM mybooks WHERE book_id IS NOT NULL
+ SELECT COUNT(*) FROM mybooks WHERE is_media = 1

The table has about 500,000 rows. Both columns are indexed.

This information changes all the time, so I don't think that caching here would work. What are some techniques to use that could speed this up? I was thinking:

  • Create a denormalized stats table that is updated whenever the columns are updated.
  • Load the slow queries via ajax (this doesn't speed it up, but it allows the page to load immediately).

What would be suggested here? The requirement is that the page loads within 1s.

Table structure:

  • id (pk, autoincrementing)
  • book_id (bigint)
  • is_media (boolean)
David542
  • 104,438
  • 178
  • 489
  • 842
  • Personally I would go the AJAX route..easy to implement, solves the problem, doesn't require new tables and avoids possible data inconsistency between `mybooks` and `stats`. – Oli Feb 21 '15 at 22:17
  • Could you include your table structure? – Pattle Feb 21 '15 at 22:21
  • 3
    500,000 rows is not that much. It should not take 5 seconds to scan them. If you update this table frequently, it may mean that the counting query is simply blocked and waiting for a long time. I would probably go with creating a stats table and updating it as the data changes. Alternatively, if you can allow the stats to be a bit out of date, run these counting queries in the background every 10 minutes or so to update the stats table. In a sense, cache the results. – Vladimir Baranov Feb 21 '15 at 22:35
  • which DBMS are you working on? which type of index do you use? – MWiesner Feb 21 '15 at 22:36
  • This question is essentially a duplicate of http://stackoverflow.com/questions/28647653/how-to-speed-up-innodb-count-query which has several suggestions already. Go read it. – Rick James Feb 21 '15 at 23:22
  • 1
    Even if you get the perfect INDEX, it will be 500K rows long, and that index will have to be scanned. Maybe not 5 seconds, but still "too long". – Rick James Feb 21 '15 at 23:23
  • In the question that @RickJames linked the author has a table with 5M rows and his `count` takes 1.5 seconds. And he has quite a lot of inserts and deletes in the table as well. So, in your 500K rows you must be doing a really high number of changes in the table to bring the `count` so slow. Or you have a really slow server. – Vladimir Baranov Feb 22 '15 at 02:58

2 Answers2

1

There are a couple of things you can do to speed up the query.

  1. Run optimize table on your mybooks table

  2. Change your book_id column to be an int unsigned, which allows for 4.2 billions values and takes 4 bytes instead of 8 (bigint), making the table and index more efficient.

Also I'm not sure if this will work but rather than doing count(*) I would just select the column in the where clause. So for example your first query would be SELECT COUNT(book_id) FROM mybooks WHERE book_id IS NOT NULL

Rick James
  • 135,179
  • 13
  • 127
  • 222
Pattle
  • 5,983
  • 8
  • 33
  • 56
  • 2
    `OPTIMIZE TABLE` almost never helps anything. It will do some defragmenting, but that is limited, and transient. `COUNT(book_id)` counts only non-NULL book_ids, so that change is redundant with the WHERE clause. – Rick James Feb 21 '15 at 23:25
1

The stats table is probably the biggest/quickest bang for buck. Assuming you have full control of your MySQL server and don't already have job scheduling in place to take care of this, you could remedy this by using the mysql event scheduler. As Vlad mentioned above, your data will be a bit out of date. Here is a quick example:

Example stats table

CREATE TABLE stats(stat VARCHAR(20) PRIMARY KEY, count BIGINT);

Initialize your values

INSERT INTO stats(stat, count)
VALUES('all_books', 0), ('media_books', 0);

Create your event that updates every 10 minutes

DELIMITER |

CREATE EVENT IF NOT EXISTS updateBookCountsEvent
ON SCHEDULE EVERY 10 MINUTE STARTS NOW()
COMMENT 'Update book counts every 10 minutes'
DO
BEGIN
    UPDATE stats
    SET count = (SELECT COUNT(*) FROM mybooks)
    WHERE stat = 'all_books';
    UPDATE stats
    SET count = (SELECT COUNT(*) FROM mybooks WHERE is_media = 1)
    WHERE stat = 'media_books';
END |

Check to see if it executed

SELECT * FROM mysql.event;

No? Check to see if the event scheduler is enabled

SELECT @@GLOBAL.event_scheduler;

If it is off you'll want to enable it on startup using the param --event-scheduler=ON or setting it in you my.cnf. See this answer or the docs.

Community
  • 1
  • 1
Voidmain
  • 141
  • 3