5

I am using Amazon RDS service to host a PostreSql which serves as a database for my Java application. After the application starts, it is able to execute queries as expected until I stop interacting for some minutes and try to execute any query again. In that scenario, I get the following exception:

WARNING: Validating connection.
org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:327)
    at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354)
    at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169)
    at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136)
    at org.postgresql.jdbc.PgConnection.isValid(PgConnection.java:1311)
    at org.apache.commons.dbcp2.DelegatingConnection.isValid(DelegatingConnection.java:897)
    at org.apache.commons.dbcp2.PoolableConnection.validate(PoolableConnection.java:270)
    at org.apache.commons.dbcp2.PoolableConnectionFactory.validateConnection(PoolableConnectionFactory.java:630)
    at org.apache.commons.dbcp2.PoolableConnectionFactory.validateObject(PoolableConnectionFactory.java:648)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:472)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:349)
    at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
    at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:753)

Caused by: java.net.SocketException: Operation timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:170)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140)
    at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:109)
    at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:67)
    at org.postgresql.core.PGStream.receiveChar(PGStream.java:288)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1962)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:300)

On Amazon RDS PostgreSQL I see the following errors:

2020-04-09 19:01:11 UTC::[]:LOG: could not receive data from client: Connection timed out
2020-04-09 19:04:27 UTC::@:[]:LOG: checkpoint starting: time
2020-04-09 19:04:28 UTC::@:[]:LOG: checkpoint complete: wrote 1 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.143 s, sync=0.001 s, total=0.154 s; sync files=1, longest=0.001 s, average=0.001 s; distance=16377 kB, estimate=16396 kB
2020-04-09 19:08:15 UTC::LOG: could not receive data from client: Connection timed out

Any idea of how to solve that issue?

JuniorStack2
  • 113
  • 1
  • 2
  • 8
  • Check if there is a firewall that drops idle connections. – Laurenz Albe Apr 10 '20 at 04:50
  • I am using a MAC for development and the firewall is turned off. – JuniorStack2 Apr 11 '20 at 13:19
  • Try executing netcat: e.g. `nc myinstance.123456789012.us-east-1.rds.amazonaws.com 5432 -v -v -z` for checking whether your workstation can connect to the database. Do you get `succeeded!` ? – Marinos An Apr 14 '20 at 10:55
  • Hi, what framework are you using? what jdbc driver, pool? Which version. Can you post any configuration regarding database, jdbc? Probably Apache Commons DBCP, version? – RenatoIvancic Apr 14 '20 at 14:29
  • Have you investigated AWS PostgreSQL logs, system logs? –  Apr 15 '20 at 13:52
  • Any chance you are using Docker Swarm or older versions of Kubernetes to create overlay networks? – JohannesB Apr 15 '20 at 17:45

3 Answers3

2

I guess you have a (virtual)network component like a router, loadbalancer, stateful firewall, etc. that decides that after X seconds your connection can be removed saving some of its precious memory. This however without informing both parties of the TCP connection (the database nor your Java application) so after some time they will notice that they are no longer connected to each other.

If you cannot change the behaviour of your network components that silently kill your connections you are left to choose from these options:

  1. Configure your Connection Pool (DBCP, or if you dislike it, switch to HikariCP) to actively close idle connections, before above timeout, see for instructions this ancient thread
  2. Configure your Connection Pool to keep checking the health of idle connections, which will keep the connection alive
  3. Use TCP features from the operating system (TCP KeepAlive) by:

    • Enabling this feature on the socket via the JDBC driver with: tcpKeepAlive="true"
    • Tuning the operating systems keepalive time settings to transmit keepalive packets before the timeout, see: Linux docs or these Windows Registry settings
  4. Don't use a connection pool (but that is bad for performance, see: this answer for more details )

JohannesB
  • 2,214
  • 1
  • 11
  • 18
1

I've faced this problem once before and it was driving me crazy. Eventually the problem was because the firewall stop any idle connection for more than 30 minutes without notifing the backend

Elarbi Mohamed Aymen
  • 1,617
  • 2
  • 14
  • 26
0

I see a potential problem here, you might be running out of connections in your connection pool to your postgres DB, this might be because the existing connection might not close after your DB transaction and is not being re-used by the new incoming transaction. Try using pgbouncer to resolve this. This my guess seeing your logs

user9484528
  • 337
  • 2
  • 12