5

I have a hyper table for exchange candle data set up using TimescaleDB.

  • TimescaleDB official image timescale/timescaledb:latest-pg12 set up and running with Docker with the exact version string starting PostgreSQL 12.6 on x86_64-pc-linux-musl, compiled by gcc (Alpine 10.2.1_pre1) 10.2.1 20201203, 64-bit

  • Python 3 client

  • The table has 5 continuous aggregate views set up like here and around 15 colums

Running the following query is slow (count query generated with SQLAlchemy):

SELECT count(*) AS count_1
FROM (SELECT my_hypertable.timestamp AS my_hypertable_timestamp
FROM my_hypertable) AS anon_1

Getting a count for a hypertable with 14,000 rows takes 7 seconds on a beefy server, measured from Python script. Whereas for a normal table the query time seems to be sub 10 ms. This makes the COUNT(*) on hypertable unusable in most of the situations.

  • Is there something in TimescaleDB architecture that makes COUNT(*) especially slow?

  • If there is something wrong with my set-up, any pointers or guesses what it could be or how to diagnose this further

  • If the count() is going to be super slow for architectural reasons, what are the recommendations to work around this - or should it be avoided altogether?

Mikko Ohtamaa
  • 82,057
  • 50
  • 264
  • 435

1 Answers1

7

you can try the approximate_row_count() function (https://docs.timescale.com/api/latest/analytics/approximate_row_count/) which gives an immediate result.

  • Thank you so much Benoit! Is there any particular reason why the normal `COUNT` is slow, so I could understand the hypertable limitations better? – Mikko Ohtamaa Jun 13 '21 at 17:25