0

The trouble with constantly running jobs on postgre DB, which will never finished.

I have tried the following actions to fix it:

  • apt-get update & upgrade(postgresql was updated to latest)
  • /etc/init.d/postgresql restart

    postgresql.service - PostgreSQL RDBMS
       Loaded: loaded (/lib/systemd/system/postgresql.service; enabled)
       Active: active (exited) since Fri 2015-10-16 10:06:04 UTC; 9s ago
       Process: 6787 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
       Main PID: 6787 (code=exited, status=0/SUCCESS)
         CGroup: /system.slice/postgresql.service
    
  • /etc/init.d/pgagent restart

    pgagent.service - Postgres Job Agent Daemon
      Loaded: loaded (/lib/systemd/system/pgagent.service; enabled)
      Active: active (running) since Fri 2015-10-16 10:06:04 UTC; 1min 48s ago
      Main PID: 6793 (pgagent)
         CGroup: /system.slice/pgagent.service
           └─6793 /usr/bin/pgagent -f -l 2 -s /var/log/pgagent hostaddr=localhost dbname=postgres user=postgresext
    
    Oct 16 10:07:00 m-t-db-01 pgagent[6793]: *** Caught unhandled unknown exception; terminating
    Oct 16 10:07:50 m-t-db-01 pgagent[6793]: *** Caught unhandled unknown exception; terminating
    
  • tried to enable debug mode on pgagent vim /etc/default/pgagent

    EXTRA_OPTS="-f -l 2 -s /var/log/pgagent hostaddr=localhost dbname=postgres user=postgresext"
    
  • tried to reboot the machine
  • in /var/log/pgagent log I see only:

    ERROR: Failed to query jobs table!
    DEBUG: Creating primary connection
    DEBUG: Connection Information:
    DEBUG:      user         : postgresext
    DEBUG:      port         : 0
    DEBUG:      host         : localhost
    DEBUG:      dbname       : postgres
    DEBUG:      password     :
    DEBUG:      conn timeout : 0
    DEBUG: Connection Information:
    DEBUG:      user         : postgresext
    DEBUG:      port         : 0
    DEBUG:      host         : localhost
    DEBUG:      dbname       : postgres
    DEBUG:      password     :
    DEBUG:      conn timeout : 0
    DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
    DEBUG: Database sanity check
    DEBUG: Clearing zombies
    DEBUG: Checking for jobs to run
    DEBUG: Sleeping...
    DEBUG: Clearing inactive connections
    DEBUG: Connection stats: total - 1, free - 0, deleted - 0
    DEBUG: Checking for jobs to run
    DEBUG: Sleeping...
    DEBUG: Creating primary connection
    DEBUG: Connection Information:
    DEBUG:      user         : postgresext
    DEBUG:      port         : 0
    DEBUG:      host         : localhost
    DEBUG:      dbname       : postgres
    DEBUG:      password     :
    DEBUG:      conn timeout : 0
    DEBUG: Connection Information:
    DEBUG:      user         : postgresext
    DEBUG:      port         : 0
    DEBUG:      host         : localhost
    DEBUG:      dbname       : postgres
    DEBUG:      password     :
    DEBUG:      conn timeout : 0
    DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
    DEBUG: Database sanity check
    DEBUG: Clearing zombies
    DEBUG: Checking for jobs to run
    DEBUG: Sleeping...
    DEBUG: Clearing inactive connections
    DEBUG: Connection stats: total - 1, free - 0, deleted - 0
    DEBUG: Checking for jobs to run
    DEBUG: Sleeping...
    DEBUG: Clearing inactive connections
    DEBUG: Connection stats: total - 1, free - 0, deleted - 0
    DEBUG: Checking for jobs to run
    DEBUG: Creating job thread for job 8
    DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
    DEBUG: Allocating new connection to database postgres
    DEBUG: Starting job: 8
    DEBUG: Creating job thread for job 5
    DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
    DEBUG: Allocating new connection to database postgres
    DEBUG: Starting job: 5
    DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres dbname=testdb
    DEBUG: Sleeping...
    DEBUG: Allocating new connection to database testdb
    DEBUG: Executing SQL step 23 (part of job 8)
    DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres dbname=testdb
    DEBUG: Allocating new connection to database testdb
    DEBUG: Executing SQL step 15 (part of job 5)
    DEBUG: Checking for jobs to run
    DEBUG: Sleeping...
    DEBUG: Clearing inactive connections
    DEBUG: Connection stats: total - 5, free - 0, deleted - 0
    DEBUG: Checking for jobs to run
    DEBUG: Sleeping...
    DEBUG: Destroying job thread for job 8
    DEBUG: Clearing inactive connections
    DEBUG: Connection stats: total - 5, free - 0, deleted - 0
    DEBUG: Checking for jobs to run
    DEBUG: Sleeping...
    DEBUG: Clearing inactive connections
    DEBUG: Connection stats: total - 5, free - 0, deleted - 0
    DEBUG: Checking for jobs to run
    DEBUG: Sleeping...
    DEBUG: Clearing inactive connections
    DEBUG: Connection stats: total - 5, free - 0, deleted - 0
    DEBUG: Checking for jobs to run
    DEBUG: Sleeping...
    DEBUG: Clearing inactive connections
    DEBUG: Connection stats: total - 5, free - 0, deleted - 0
    DEBUG: Checking for jobs to run
    DEBUG: Sleeping...
    
  • in vim /var/log/postgresql/postgresql-9.4-main.log I see only:

    [unknown]@[unknown] LOG:  incomplete startup packet
    LOG:  MultiXact member wraparound protections are now enabled
    LOG:  database system is ready to accept connections
    LOG:  autovacuum launcher started
    postgresext@postgres LOG:  could not receive data from client: Connection reset by peer
    postgresext@testdb LOG:  could not receive data from client: Connection reset by peer
    postgresext@postgres LOG:  could not receive data from client: Connection reset by peer
    postgresext@postgres LOG:  could not receive data from client: Connection reset by peer
    postgresext@testdb LOG:  could not receive data from client: Connection reset by peer
    postgresext@postgres LOG:  could not receive data from client: Connection reset by peer
    postgresext@postgres LOG:  could not receive data from client: Connection reset by peer
    postgresext@testdb LOG:  could not receive data from client: Connection reset by peer
    postgresext@testdb LOG:  could not receive data from client: Connection reset by peer
    postgresext@testdb LOG:  could not receive data from client: Connection reset by peer
    postgresext@postgres LOG:  could not receive data from client: Connection reset by peer
    

I cannot figure out what the actual problem and how to fix it actually?

One more thing I have investigated is that pgagent[6793]: *** Caught unhandled unknown exception; terminating can be invoked by incorrect connection termination...

Viktor M.
  • 4,393
  • 9
  • 40
  • 71

1 Answers1

0

A little bit late to answering this, but I ran into the same issues when using pgAgent, and it caused nothing but frustration. I even tried to dig into the source to fix it, but just trying to get it to compile on my server was a total nightmare due to the dependencies it requires (for no good reason).

So with that in mind, I decided to write a new job scheduler for Postgres as a drop in replacement for pgAgent. It's called jpgAgent, and i've been using it at my company for a couple months now with no issues at all. Much more stable now that we're using jpgAgent, don't have to worry about the agent just stopping on us randomly and jobs not getting run as they should, where as previously that was a common problem.

Anyways, hope it helps you out.

Adam B
  • 1