OpenSolaris derivative (NexentaStor), Python 2.5.5

I've seen numerous examples and many seem to indicate that the problem is a deadlock. I'm not writing to stdin so I think the problem is that one of the shell commands exits prematurely.

What's executed in Popen is:

ssh <remotehost> "zfs send tank/dataset@snapshot | gzip -9" | gzip -d | zfs recv tank/dataset

In other words: log in to a remote host, send a replication stream of a storage volume and compress it with gzip there, then decompress it locally and pipe it to zfs recv to write it to a local datastore.

I've seen the explanation about pipe buffers filling up, but I'm definitely not filling those. gzip is bailing out prematurely, so I think process.wait() never sees an exit.

process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
process.wait()
if process.returncode == 0:
    for line in process.stdout:
        stdout_arr.append([line])
    return stdout_arr
else:
    return False

Here's what happens when I run it and then interrupt it:

# ./zfs_replication.py 
gzip: stdout: Broken pipe

^CKilled by signal 2.
Traceback (most recent call last):
  File "./zfs_replication.py", line 155, in <module>
    Exec(zfsSendRecv(dataset, today), LOCAL)
  File "./zfs_replication.py", line 83, in Exec
    process.wait()
  File "/usr/lib/python2.5/subprocess.py", line 1184, in wait
    pid, sts = self._waitpid_no_intr(self.pid, 0)
  File "/usr/lib/python2.5/subprocess.py", line 1014, in _waitpid_no_intr
    return os.waitpid(pid, options)
KeyboardInterrupt

I also tried the Popen.communicate() method, but that hangs too if gzip bails out. In this case the last part of my command (zfs recv) exits because the local dataset has been modified, so the incremental replication stream cannot be applied. That particular cause will be fixed, but surely there has to be a way of dealing with gzip's broken pipes?

process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
stdout, stderr = process.communicate()

if process.returncode == 0:
    dosomething()
else:
    dosomethingelse()

And when run:

cannot receive incremental stream: destination tank/repl_test has been modified since most recent snapshot
gzip: stdout: Broken pipe

^CKilled by signal 2.
Traceback (most recent call last):
  File "./zfs_replication.py", line 154, in <module>
    Exec(zfsSendRecv(dataset, today), LOCAL)
  File "./zfs_replication.py", line 83, in Exec
    stdout, stderr = process.communicate()
  File "/usr/lib/python2.5/subprocess.py", line 662, in communicate
    stdout = self._fo_read_no_intr(self.stdout)
  File "/usr/lib/python2.5/subprocess.py", line 1025, in _fo_read_no_intr
    return obj.read()
KeyboardInterrupt
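
To see which stage of a pipeline like the one above exits prematurely, one option is to start each stage as its own Popen object instead of handing the whole line to a shell. The following is only a sketch under assumptions: run_pipeline is a hypothetical helper name, the stages are passed as argv lists, and stderr is left attached to the terminal. It does not by itself add a timeout; it only makes every stage's exit code visible.

```python
import subprocess


def run_pipeline(stages):
    """Chain argv lists with pipes; return (output, [returncode per stage])."""
    procs = []
    prev_stdout = None
    for argv in stages:
        proc = subprocess.Popen(argv, stdin=prev_stdout,
                                stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # Close the parent's copy of the read end. Otherwise the
            # upstream stage never sees SIGPIPE/EPIPE when this stage
            # exits early, because the pipe still has a reader.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    # Drain the last stage's stdout until EOF, then collect exit codes.
    output = procs[-1].communicate()[0]
    codes = [p.wait() for p in procs]
    return output, codes


# Hypothetical usage with the command from the question:
# output, codes = run_pipeline([
#     ["ssh", "<remotehost>", "zfs send tank/dataset@snapshot | gzip -9"],
#     ["gzip", "-d"],
#     ["zfs", "recv", "tank/dataset"],
# ])
```

A non-zero entry in codes then identifies the stage that failed, for example zfs recv rejecting the incremental stream.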
user135361
  • Before executing a command with `Popen` you should try it at shell level. From what you show, you compress on the remote side but do not decompress locally ... – Serge Ballesta Nov 04 '14 at 08:49
  • Thanks for the notice, the missing decompress was added (and is of course in the script). The command itself works, but as the data is sent across a slow WAN link it's quite normal for it to break, so it's important that this can be controlled and does not just hang forever. – user135361 Nov 04 '14 at 09:01
  • Does the command print anything when you run it from the commandline? – Aaron Digulla Nov 04 '14 at 09:30
  • Only: "cannot receive incremental stream: destination tank/repl_test has been modified since most recent snapshot" – user135361 Nov 04 '14 at 10:15
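
One way to keep a broken transfer from hanging forever, as asked for in the comments above, is to poll the child with a deadline and kill its whole process group when the deadline passes. This is a minimal sketch, not a tested fix for this exact setup: run_with_deadline, timeout_s and poll_interval_s are made-up names, stdout goes to a temporary file so an unread pipe cannot block the child, and the code sticks to calls that should also exist on Python 2.5 (Popen.kill() and Popen.terminate() were only added in 2.6).

```python
import os
import signal
import subprocess
import tempfile
import time


def run_with_deadline(cmd, timeout_s, poll_interval_s=0.5):
    """Run a shell command line; return (returncode, output).

    returncode is None if the deadline passed and the group was killed.
    """
    out = tempfile.TemporaryFile()
    try:
        # os.setsid puts the shell and every pipeline stage it starts
        # into a new process group whose id equals the shell's pid.
        process = subprocess.Popen(cmd, shell=True, stdout=out,
                                   preexec_fn=os.setsid)
        deadline = time.time() + timeout_s
        timed_out = False
        while process.poll() is None:
            if time.time() >= deadline:
                timed_out = True
                try:
                    # Signal the whole group, not just the shell.
                    os.killpg(process.pid, signal.SIGTERM)
                except OSError:
                    pass  # the group vanished between poll() and killpg()
                process.wait()
                break
            time.sleep(poll_interval_s)
        out.seek(0)
        output = out.read()
    finally:
        out.close()
    if timed_out:
        return None, output
    return process.returncode, output
```

Because the children run in their own session, a Ctrl-C in the terminal no longer reaches them directly, so the caller would have to handle KeyboardInterrupt and kill the group itself. Escalating from SIGTERM to SIGKILL after a grace period is left out of this sketch.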
