This happens only in the production environment; I could not reproduce the same behaviour in a sandbox.
#0 0x00007f07854d8cb9 in __GI___poll (fds=0x7f076d8f1e88, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1 0x00007f077c2ad990 in ?? () from /usr/lib/x86_64-linux-gnu/libpq.so.5
#2 0x00007f077c2adaf8 in ?? () from /usr/lib/x86_64-linux-gnu/libpq.so.5
#3 0x00007f077c2aa2d9 in PQgetResult () from /usr/lib/x86_64-linux-gnu/libpq.so.5
#4 0x00007f077c5176b3 in ?? () from /usr/lib/x86_64-linux-gnu/odbc/psqlodbcw.so
#5 0x00007f077c4fa644 in ?? () from /usr/lib/x86_64-linux-gnu/odbc/psqlodbcw.so
#6 0x00007f077c4fbab5 in ?? () from /usr/lib/x86_64-linux-gnu/odbc/psqlodbcw.so
#7 0x00007f077c53242a in ?? () from /usr/lib/x86_64-linux-gnu/odbc/psqlodbcw.so
#8 0x00007f077c50de7a in ?? () from /usr/lib/x86_64-linux-gnu/odbc/psqlodbcw.so
#9 0x00007f077c50f3a7 in ?? () from /usr/lib/x86_64-linux-gnu/odbc/psqlodbcw.so
#10 0x00007f077c50fac1 in ?? () from /usr/lib/x86_64-linux-gnu/odbc/psqlodbcw.so
#11 0x00007f077c5395bb in SQLExecDirect () from /usr/lib/x86_64-linux-gnu/odbc/psqlodbcw.so
#12 0x00007f0787a5a16f in SQLExecDirect () from /usr/lib/x86_64-linux-gnu/libodbc.so.2
#13 0x00000000004be434 in my_function_where_i_call_sql_exec_direct ()
We use managed Postgres in a cloud similar to AWS. The service provides a web interface that gives me access to the server logs. The strangest thing to me is that the log shows my query as successfully executed:
{
"transaction_id": "0",
"error_severity": "LOG",
"hostname": "rc1a-...mdb.yandexcloud.net",
"internal_query": "",
"process_id": "2624784",
"session_id": "60672f13.280d10",
"query_pos": "0",
"user_name": "ace",
"application_name": "ace-realm-meta - 10.128.0.9 ",
"command_tag": "idle in transaction",
"context": "",
"hint": "",
"message": "statement: RELEASE _EXEC_SVP_0x319d9f0;SAVEPOINT _EXEC_SVP_0x319d9f0;SELECT R_Object.meta FROM R_Object WHERE ((R_Object.id \\= 15) ) ORDER BY R_Object.id ASC ",
"session_line_num": "2817",
"connection_from": "localhost:48058",
"database_name": "droblozhko1",
"internal_query_pos": "0",
"query": "",
"virtual_transaction_id": "59/191031",
"detail": "",
"location": "",
"session_start_time": "2021-04-02T17:49:55+03:00",
"sql_state_code": "00000"
}
(which should mean that no exclusive locks are blocking my SELECT?!)
I have googled a lot and found just one discussion of a similar problem: https://postgrespro.com/list/thread-id/2394695, but neither SQL_ATTR_QUERY_TIMEOUT nor the keepalive pqoptions mentioned in that thread have any effect :\
Please share any thoughts, because I have completely run out of ideas.
Update 1. Why is the timeout negative (in the innermost frame, #0)? This is why poll() never returns.
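To make the poll() behaviour concrete, here is a minimal sketch of how the timeout argument is interpreted. A negative value means "wait forever", which matches timeout=-1 in frame #0. The helper name wait_readable is hypothetical and is not part of libpq or psqlodbc; it only illustrates the system call.

```c
#include <poll.h>

/* Hypothetical helper (not part of libpq): wait for an event on fd.
 *   timeout_ms <  0 : block until an event arrives (the timeout=-1 case in frame #0)
 *   timeout_ms == 0 : check once and return immediately
 *   timeout_ms >  0 : give up after that many milliseconds
 * Returns 1 if poll() reported an event, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;   /* poll() failed; errno holds the reason */
    if (rc == 0)
        return 0;    /* timeout expired and nothing happened */
    return 1;        /* data, hang-up or error was reported on fd */
}
```

With a finite timeout the call returns 0 when nothing arrives, so the caller gets a chance to give up. With -1 it can only return once the kernel reports something on the socket.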
Update 2. I found out that PQgetResult() calls pqWait(), which calls pqWaitTimed() with finish_time hardcoded to -1. The question now is: how is this logic supposed to deal with network issues?
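As far as I understand, with an infinite wait the only thing that can wake poll() on a dead connection is the kernel reporting an error on the socket, and that is what the libpq connection parameters keepalives, keepalives_idle, keepalives_interval, keepalives_count and tcp_user_timeout are meant to configure. Below is a Linux-only sketch of the socket options I believe they correspond to. The helper name enable_keepalive and the values used are hypothetical; this is not libpq's actual code.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper, Linux-specific: ask the kernel to detect a dead peer.
 *   idle_s          : seconds of silence before the first keepalive probe
 *   interval_s      : seconds between probes
 *   count           : unanswered probes before the connection is dropped
 *   user_timeout_ms : max time sent data may stay unacknowledged
 * Returns 0 on success, -1 if any setsockopt() call fails. */
int enable_keepalive(int sock, int idle_s, int interval_s, int count,
                     unsigned int user_timeout_ms)
{
    int on = 1;

    if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) < 0)
        return -1;
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof interval_s) < 0)
        return -1;
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof count) < 0)
        return -1;
    if (setsockopt(sock, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &user_timeout_ms, sizeof user_timeout_ms) < 0)
        return -1;
    return 0;
}
```

Once the kernel gives up on the peer, a blocked poll() should return with an error condition on the socket even though its own timeout is -1. In my case the keepalive settings did not change anything, which is what puzzles me.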