I'm running into a problem when I try to restore a large database (almost 32 GB in custom format) on my development database node (this node has less RAM, CPU... than my production server).

My database dumps are generated with a command similar to:

pg_dump -F custom -b myDB -Z 9 > /backup/myDB-`date +%y%m%d`.pg91

And when I restore it, I use the following command:

pg_restore -F custom -j 5 -d myDB /backup/myDB-20130331.pg91

But each time, the restore command fails with errors like:

pg_restore: [archiver (db)] error returned by PQputCopyData: server closed the connection unexpectedly
This probably means the server terminated abnormally
    before or while processing the request.
pg_restore: [archiver] worker process failed: exit code 1
pg_restore: [archiver (db)] error returned by PQputCopyData: server closed the connection unexpectedly
This probably means the server terminated abnormally
    before or while processing the request.
pg_restore: [archiver (db)] error returned by PQputCopyData: server closed the connection unexpectedly
This probably means the server terminated abnormally
    before or while processing the request.
pg_restore: [archiver (db)] error returned by PQputCopyData: server closed the connection unexpectedly
This probably means the server terminated abnormally
    before or while processing the request.

And when I check my PostgreSQL logs, I see this:

   HINT:  In a moment you should be able to reconnect to the database and repeat your command.
   LOG:  all server processes terminated; reinitializing
   LOG:  database system was interrupted; last known up at 2013-04-02 11:41:48 UTC
   LOG:  database system was not properly shut down; automatic recovery in progress
   LOG:  redo starts at 86/26F302B0
   LOG:  unexpected pageaddr 85/E3F52000 in log file 134, segment 38, offset 16064512
   LOG:  redo done at 86/26F51FC0
   LOG:  last completed transaction was at log time 2013-04-02 11:50:47.663599+00
   LOG:  database system is ready to accept connections
   LOG:  autovacuum launcher started

It's quite strange: my PostgreSQL server "restarts" on its own just because of my restore. I tried reducing the number of jobs (the -j 5 option) but still got the same problem. However, on a node with better specs I have no problem restoring this database. I'm not sure, but maybe the rebuilding of my indexes (one of them is really large) could be a clue to understanding this issue?

So I have some questions: is there a better way to restore a really large database? Am I missing something in my pg_restore command? Maybe the settings of my devel server are too low?

Any clue will be greatly appreciated. Thanks in advance.

env: PostgreSQL 9.1 (installed via Debian packages)

Labynocle
  • Looks like a backend crash, but you'll need to show more of the logs to really say for sure or to know why. – Craig Ringer Apr 02 '13 at 13:39
  • Hi @CraigRinger, are you suggesting I make the log more verbose to understand what's going on? OK, I'll try it and hope to see more info – Labynocle Apr 02 '13 at 13:43
  • You might also reduce the number of jobs again, from 5 down to just 2. That takes longer, but may be less demanding on your development node. – thisfeller Apr 02 '13 at 13:51
  • Hi @thisfeller, yes, I already reduced the number of jobs but I still have the same problem – Labynocle Apr 02 '13 at 15:05
  • If the development box is Intel, I'd suggest testing the memory with [memtest86](http://www.memtest86.com/), since that's the kind of symptom faulty memory produces – Daniel Vérité Apr 02 '13 at 15:10
  • Hi @DanielVérité, yep, you're right, it could be a factor even though I haven't noticed anything unusual; I'll test my memory as soon as possible – Labynocle Apr 02 '13 at 15:41
  • If you are running Linux, your pg process is most likely the victim of the OOM killer. You should have a look at the kernel logs to see if the OOM killer has indeed been triggered by a lack of free memory. – didierc Apr 02 '13 at 17:10
  • @Erwan No, I'm saying you should show more of the existing log, the last 30 lines or so rather than just the last 10. – Craig Ringer Apr 02 '13 at 23:23
  • @DanielVérité Checking `dmesg` for hardware error monitoring alerts can be useful on modern hardware too. It'll also show any OOM killer invocations. – Craig Ringer Apr 02 '13 at 23:24
  • I have a clue about my issue... I left `autovacuum_max_workers` at its default value (which is 3), and combined with the `maintenance_work_mem` value I set, that could exceed the amount of memory on my node (if all the workers are launched) – see the rough arithmetic sketched just after these comments. I reduced the number of workers and relaunched my restore... let's see if it works – Labynocle Apr 03 '13 at 08:48
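
To make the arithmetic in that last comment concrete, here is a rough sketch; the 1GB maintenance_work_mem figure is an assumed example value, not taken from the question:

    # check the relevant settings (the values in the comments are hypothetical)
    psql -d myDB -c "SHOW maintenance_work_mem;"    # e.g. 1GB
    psql -d myDB -c "SHOW autovacuum_max_workers;"  # default is 3
    # during the restore, each of the 5 pg_restore jobs can use up to
    # maintenance_work_mem while building indexes, and each autovacuum worker
    # can claim the same amount, so the worst case is roughly
    # (5 + 3) * 1GB = 8GB, which can easily exceed the RAM of a small devel
    # node and trigger the Linux OOM killer:
    dmesg | grep -i -E "out of memory|oom"          # look for OOM killer hits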

1 Answer

For this kind of big job, it is recommended to disable autovacuum (by setting it to off in your postgresql.conf) during the restore process.
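
A minimal sketch of what that can look like on a Debian install such as the one in the question; the config file path and the -j value are assumptions, so adjust them to your setup:

    # in /etc/postgresql/9.1/main/postgresql.conf (usual Debian location), set:
    #     autovacuum = off
    # then reload the configuration and run the restore:
    psql -c "SELECT pg_reload_conf();"
    pg_restore -F custom -j 2 -d myDB /backup/myDB-20130331.pg91
    # once the restore has finished, set autovacuum back to on and reload again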

It seems this finally worked for me.

Labynocle
  • This worked for me – can you provide a source (or more information) on why it's recommended to disable `autovacuum`, or what it's specifically doing that would crash the database? Thanks! – christianbundy Mar 20 '15 at 00:59
  • The ``vacuum`` process can degrade the performance of your system (it gathers statistics, increases I/O traffic...), so it's worth disabling it to avoid this kind of problem. However, running ``vacuum`` is important to keep statistics up to date, just not necessary during the dump. Don't forget to run it manually afterwards, or to re-enable the ``autovacuum`` daemon (a short example follows these comments). – Labynocle Mar 20 '15 at 08:43
  • @Labynocle could you clarify whether it is crucial during the dump or the restore? In the answer you talk about the restore and in the comment about the dump, or do you simply mean both? ;) – andilabs Jan 21 '20 at 14:21
  • @andilabs in my case it was during the restore! – Labynocle Jan 21 '20 at 15:55
  • How to do this in PGadmin? – Abhishek Gautam Jan 07 '21 at 16:57
  • That can only work by coincidence, since autovacuum doesn't crash PostgreSQL. – Laurenz Albe Jun 09 '23 at 14:52
  • The command `alter system set autovacuum = off` in psql seems to work too. – geon Aug 31 '23 at 09:36
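
As a follow-up to the comment above about re-enabling autovacuum and refreshing statistics once the restore is done, a minimal sketch (vacuumdb ships with the PostgreSQL client packages; the database name is the one from the question):

    # after setting autovacuum = on again in postgresql.conf:
    psql -c "SELECT pg_reload_conf();"
    # refresh planner statistics for the freshly restored database:
    vacuumdb --analyze-only -d myDB
    # a plain "vacuumdb --analyze -d myDB" also works; right after a restore
    # there is nothing to clean up, but the VACUUM pass sets hint bits so the
    # first reads do less work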