
I have two scripts, parent.py and child.py. parent.py calls child.py as a subprocess. child.py has a function that collects certain results in a dictionary, and I wish to return that dictionary to the parent process. I have tried printing the dictionary from child.py to its stdout so that the parent process can read it, but that isn't helping me, because the parent reads the dictionary's contents as strings on separate lines.

Moreover, as suggested in the comments, I tried serializing the dictionary as JSON when printing it to stdout and reading it back in the parent with JSON. That works fine, but I also print a lot of other information from the child to its stdout, which the parent eventually reads as well, and that mixes things up.

Another suggestion was to have the child write the result to a file in the directory and have the parent read from that file. That would work too, but I will be running hundreds of instances of this code in Celery, so that would lead to other instances of the child overwriting the same file.

My question is: since we already have a pipe connecting the two processes, how can I write my dictionary directly into the pipe from child.py and have it read in parent.py?

# parent.py

import subprocess

proc = subprocess.Popen(['python3', 'child.py'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE)
out, _ = proc.communicate()  # communicate() returns a (stdout, stderr) tuple of bytes
result = out                 # everything the child printed, as one blob of bytes

# child.py

def child_function():
    result = {}
    result[1] = "one"
    result[2] = "two"
    print(result)
    #return result
    
if __name__ == "__main__":
    child_function()
  • In order to transmit the dictionary through a pipe, you need to serialize it, for example as JSON. But, why do you not just import and call `child_function()` from parent.py? – mkrieger1 Oct 19 '20 at 10:02
  • You can use JSON for serializing data: `print(json.dumps(result))` in *child.py* and `result = json.loads(proc.stdout)` in *parent.py*. – Ionut Ticus Oct 19 '20 at 10:02
  • @mkrieger1 Thanks for replying. Yes, I wish I could just import child_function() from the parent, but I'm in a situation where I have to call child.py as a subprocess: I run the parent as root, but some functionality in child.py does not work as the root user, so I call child.py via subprocess as a separate non-root user. – deejangorebaba Oct 19 '20 at 10:08
  • @IonutTicus Thank you for the suggestion, I will try that. – deejangorebaba Oct 19 '20 at 10:08
  • @IonutTicus Isn't there any way to pass the dictionary directly from child.py to the parent via the pipe, instead of printing it to the child's stdout and having the parent read it from there? I mean, why print it on stdout when we are already connected via a pipe; why can't we just pass that value directly through the pipe to the parent? The reason I'm asking is that I also print a lot of other stuff (debug info) from the child to its stdout, and those other things will get mixed up with the dictionary if I print it. – deejangorebaba Oct 19 '20 at 10:24

3 Answers


A subprocess running Python is in no way different from a subprocess running something else. Python doesn't know or care that the other program is also a Python program; they have no access to each other's variables, memory, running state, or other internals. Simply imagine that the subprocess is a monolithic binary. The only ways you can communicate with it are to send and receive bytes (which can be strings, if you agree on a character encoding) and signals (so you can kill your subprocess, or raise some other signal which it can trap and handle -- like a timer; you get exactly one bit of information when the timer expires, and what you do with that bit is up to the receiver of the signal).

To "serialize" information means to encode it in a way which lets the recipient deserialize it. JSON is a good example; you can transfer a structure consisting of a (possibly nested structure of) dictionary or list as text, and the recipient will know how to map that stream of bytes into the same structure.

When both sender and receiver are running the same Python version, you could also use pickle; pickle is a native Python format which lets you transfer richer structures. But if your needs are modest, I'd simply go with JSON.
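
If both sides do run the same Python, a minimal pickle sketch (my own illustration, not part of this answer; it assumes the pickled payload is the only thing the child writes to stdout) could look like this on the parent side:

import pickle
import subprocess

proc = subprocess.run(['python3', 'child.py'], check=True, capture_output=True)
result = pickle.loads(proc.stdout)   # proc.stdout is bytes; unpickle the dictionary

and on the child side:

import pickle
import sys

result = {1: "one", 2: "two"}
sys.stdout.buffer.write(pickle.dumps(result))   # pickle is binary, so write to the bytes buffer

Unlike JSON, pickle also preserves the integer keys; with json.dumps/json.loads they come back as strings.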

parent.py:

import subprocess
import json

# Prefer subprocess.run() over bare-bones Popen()
proc = subprocess.run(['python3', 'child.py'],
    check=True, capture_output=True, text=True)
result = json.loads(proc.stdout)

child.py:

import json
import logging

def child_function():
    result = {}
    result[1] = "one"
    result[2] = "two"
    logging.info('Some unrelated output which should not go into the JSON')
    print(json.dumps(result))
    #return result
    
if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    child_function()

To avoid mixing JSON with other output, print the other output to standard error instead of standard output (or find a way to embed it into the JSON after all). The logging module is a convenient way to do that, with the added bonus that you can easily turn it off, partially or entirely (the example above effectively silences the INFO message, because logging.basicConfig only enables messages of priority WARNING or higher). The parent will receive these messages in proc.stderr.
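
For completeness, here is a rough sketch of how the parent separates the two streams with the run() call above (and, for anyone stuck with the question's bare Popen, the equivalent via communicate()); this is my own illustration, assuming the child.py shown above:

import json
import subprocess

proc = subprocess.run(['python3', 'child.py'],
                      check=True, capture_output=True, text=True)
result = json.loads(proc.stdout)      # stdout carries only the JSON line
logs = proc.stderr                    # any logging output arrives here

# roughly the same with Popen:
popen = subprocess.Popen(['python3', 'child.py'],
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
out, err = popen.communicate()
result = json.loads(out)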

tripleee

Have the parent create a FIFO (named pipe) for the child:

import os, subprocess

pipe_path = 'mypipe'
os.mkfifo(pipe_path)                  # create the named pipe (FIFO) on disk
proc = subprocess.Popen(['python3', 'child.py', pipe_path])
with open(pipe_path) as pipe:         # blocks until the child opens the FIFO for writing
    print(pipe.read())
proc.wait()

Now the child can do this:

import sys

pipe_path = sys.argv[1]               # FIFO path passed in by the parent
with open(pipe_path, 'w') as pipe:
    pipe.write(str(result))           # result is the dictionary the child computed

This keeps your communication separate from stdin/stdout/stderr.
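
If you want structured data back rather than the dictionary's str() form, the same FIFO can carry JSON just as well. A small sketch of that variant (my own illustration, assuming the parent/child layout above) on the child side:

import json
import sys

result = {1: "one", 2: "two"}         # whatever the child computed
pipe_path = sys.argv[1]               # FIFO path passed in by the parent
with open(pipe_path, 'w') as pipe:
    json.dump(result, pipe)           # serialize instead of writing str(result)

and on the parent side, after creating the FIFO as shown above:

import json

with open('mypipe') as pipe:          # the same path the parent passed to the child
    result = json.load(pipe)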

John Zwinck

You can get the results via a file.

parent.py:

import tempfile
import os
import subprocess
import json


fd, temp_file_name = tempfile.mkstemp() # create temporary file
os.close(fd) # close the file
proc = subprocess.Popen(['python3', 'child.py', temp_file_name]) # pass file_name
proc.communicate()
with open(temp_file_name) as fp:
    result = json.load(fp) # get dictionary from here
os.unlink(temp_file_name) # no longer need this file

child.py:

import sys
import json


def child_function(temp_file_name):
    result = {}
    result[1] = "one"
    result[2] = "two"
    with open(temp_file_name, 'w') as fp:
        json.dump(result, fp)

    
if __name__ == "__main__":
    child_function(sys.argv[1]) # pass the file name argument
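
Since the question mentions running hundreds of instances under Celery: tempfile.mkstemp() creates a brand-new file with a unique name on every call, so concurrent instances do not overwrite each other's files. A tiny, hypothetical sketch just to illustrate the uniqueness:

import os
import tempfile

paths = []
for _ in range(5):
    fd, name = tempfile.mkstemp()   # each call creates and opens a new, uniquely named file
    os.close(fd)
    paths.append(name)

print(len(set(paths)))              # 5 -- every call got its own distinct path

for name in paths:                  # clean up
    os.unlink(name)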
Booboo
  • Thanks for the reply. I had thought of this solution too, but the thing is that this whole thing will be sent to Celery (task queues) and hundreds of instances of this code will run in parallel, and I fear that would definitely lead to overwrites of that file because, again, hundreds of instances would be running the same code. – deejangorebaba Oct 19 '20 at 11:52
  • I would have to see more of the code, but it's not clear from your description why each instance would not have its own unique temporary file. In other words, I don't see this as vastly different from the answer that used `subprocess.run()` and retrieved the JSON from stdout, at least as far as overwriting anything is concerned. That solution may be more straightforward; I offered this just as another way that doesn't require you to modify how you use stdout or any other aspect of your code. – Booboo Oct 19 '20 at 12:39
  • Thanks for the reply, I am still looking into this and other aspects too. You are correct in what you said in the later part of your comment. – deejangorebaba Oct 19 '20 at 12:48