1

Having following potential infinite-time execution query. It does not make sense why it had been issued to Clickhouse server. Query is already has been launched and still running:

SELECT Count("SN".*) FROM (SELECT sleepEachRow(3) FROM system.numbers) "SN"

Okay, try to find associated query_id or already have one. For instance, query_id = 'd02f4bdb-8928-4347-8641-4da4b9c0f486'. Let's kill it via following query:

KILL QUERY WHERE query_id = 'd02f4bdb-8928-4347-8641-4da4b9c0f486' 

Achieved kill-query result seems to be okay from first look:

┌─kill_status─┬─query_id─────────────────────────────┬─user────┬─query────────────────────────────────────────────────────────────────────────┐
│ waiting     │ d02f4bdb-8928-4347-8641-4da4b9c0f486 │ default │ SELECT Count("SN".*) FROM (SELECT sleepEachRow(3) FROM system.numbers) "SN"; │
└─────────────┴──────────────────────────────────────┴─────────┴──────────────────────────────────────────────────────────────────────────────┘

Okay, let's wait for several seconds and ensure that original query had been terminated successfully. Let's check it via following system information schema query:

SELECT "query_id", "query", "is_cancelled" FROM system.processes WHERE query_id = 'd02f4bdb-8928-4347-8641-4da4b9c0f486';

Unfortunately original query is still running in a some sense. It turned into "is_cancelled" state and still hangs:

┌─query_id─────────────────────────────┬─query────────────────────────────────────────────────────────────────────────┬─is_cancelled─┐
│ d02f4bdb-8928-4347-8641-4da4b9c0f486 │ SELECT Count("SN".*) FROM (SELECT sleepEachRow(3) FROM system.numbers) "SN"; │            1 │
└──────────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────┴──────────────┘

Waiting for hour and more time and still getting some results. Original query is still hanged in "is_cancelled" state. Subsequent KILL queries with same query_id does not do nothing.

Most likely, restarting the server will help solve the problem, but I do not want to do this. How to solve the problem with a stuck query without server restarting?

Vladislav Ihost
  • 2,127
  • 11
  • 26
  • 1
    https://kb.altinity.com/altinity-kb-queries-and-syntax/altinity-kb-kill-query/ – Denny Crane May 24 '22 at 19:52
  • @DennyCrane Hmm, seems strange. Moreover, there are comment in linked issue in your KB link above, referring that problem is fixed in fresh CH versions https://github.com/ClickHouse/ClickHouse/issues/1576#issuecomment-1106432528 – Vladislav Ihost May 25 '22 at 07:50
  • However, query like this `SELECT DISTINCT ("number"*0) AS "Defunct", sleep(3) FROM system.numbers_mt` is interruptible and does produce exact one result row – Vladislav Ihost May 25 '22 at 07:51

1 Answers1

3

ClickHouse queries can't be killed during the sleep.

If you are using a recent CH release (21.12+), then the KILL flag will be checked after each block is processed (on older releases it might never be checked). Since the default block is 65536, the query will be slept for 65536 * 3 seconds ~= 54 hours before checking anything.

In future releases of CH it will be impossible to sleep for more than 3 seconds (which right now is a limit of sleep but not for sleepEachRow). In the meantime you can either wait or restart the server.

Algunenano
  • 124
  • 4
  • Okay, thanks! I found that query like `SELECT DISTINCT ("number"*0) AS "Defunct", sleep(3) FROM system.numbers_mt` can be killed. So just it's necessary to disable issuing queries to CH server with Scalar Subquery. – Vladislav Ihost May 26 '22 at 12:02
  • Also, it's not clear why CH server does not halt queries when TCP/HTTP connection torn? Anyway there is no network socket to respond to, so why not to kill these connection and appropriate queries forcefully in automated manner? – Vladislav Ihost May 26 '22 at 12:03