3

Example:

from multiprocessing.dummy import Pool as ThreadPool

def testfunc(string):
    print string

def main():

    strings = ['one', 'two', 'three', ...]
    pool = ThreadPool(10)
    results = pool.map(testfunc, strings)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()

This does not give clean output with one result per line:

one
two 
three

Instead, I get a mess with random line breaks, like

one 


two
three

four
five
...

Why does this happen? Can I output my data with exactly one line break per function call?

P.S. Sometimes there are no line breaks, or even no spaces, between the strings at all! P.P.S. I am working on Windows.

avasin

4 Answers

4

print is not an atomic operation, so one print can be interrupted partway through by another print in a different thread (multiprocessing.dummy pools use threads, not processes). You can prevent two threads from calling print simultaneously by putting a Lock around it.

from multiprocessing.dummy import Pool as ThreadPool
from multiprocessing import Lock

print_lock = Lock()

def testfunc(string):
    # Hold the lock for the whole print so the text and its newline stay together.
    with print_lock:
        print string

def main():

    strings = ['one', 'two', 'three', 'four', 'five']
    pool = ThreadPool(10)
    results = pool.map(testfunc, strings)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()
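
For Python 3, the same idea can be wrapped in a small helper so every call site stays a one-liner. This is only a sketch: `safe_print` and `run` are hypothetical names, not part of any library, and `threading.Lock` is used here because the pool's workers are threads.

```python
import threading
from multiprocessing.dummy import Pool as ThreadPool

# One lock shared by every worker thread.
_print_lock = threading.Lock()

def safe_print(*args, **kwargs):
    """Hypothetical helper: call print() while holding the shared lock."""
    with _print_lock:
        print(*args, **kwargs)

def run(strings, workers=10):
    # map() blocks until every item is processed, so leaving the
    # with-block (which shuts the pool down) is safe afterwards.
    with ThreadPool(workers) as pool:
        pool.map(safe_print, strings)
```

Calling `run(['one', 'two', 'three'])` prints each string on its own line, although the order of the lines can still vary between runs.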
Kevin
3

Five years too late, but I think it may be worth mentioning that a quick and dirty solution to this is to explicitly include the linebreak at the end of the print:

print('message\n', end='')
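
A rough way to see why this helps, assuming CPython 3: print hands the text and the line ending to the stream in separate write calls, which leaves a gap where another thread can slip in. The sketch below records those calls with a hypothetical `RecordingStream` class (not a library type). Note that a single write is still not formally guaranteed to be atomic, so a lock remains the more robust fix.

```python
class RecordingStream:
    """Hypothetical file-like object that remembers each non-empty write."""

    def __init__(self):
        self.calls = []

    def write(self, text):
        if text:  # ignore empty writes such as end=''
            self.calls.append(text)
        return len(text)

    def flush(self):
        pass

def writes_made_by(*args, **kwargs):
    # Run print() against a fresh recorder and return the writes it made.
    stream = RecordingStream()
    print(*args, file=stream, **kwargs)
    return stream.calls

default_calls = writes_made_by('message')              # text and newline separately
explicit_calls = writes_made_by('message\n', end='')   # one combined write
```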
OlleNordesjo
2

Because the workers (processes or threads, depending on the pool you're using) are not synchronized. You can use a lock.

Or, instead of printing in the workers, you can return the values and print them in the main process.

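from multiprocessing.dummy import Pool as ThreadPool

def testfunc(string):
    # Return the value instead of printing it in the worker.
    return string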

def main():
    strings = ['one', 'two', 'three', ...]
    pool = ThreadPool(10)
    results = pool.map(testfunc, strings)
    for result in results:
        print result
    pool.close()
    pool.join()
falsetru
  • @dano, you're right. `multiprocessing.dummy.Pool` returns a `multiprocessing.pool.ThreadPool`. I removed that part accordingly. Thank you for pointing that out. – falsetru Oct 01 '14 at 14:11
1

All the threads write to the same output file, in this case stdout. So before one thread has finished writing, other threads may already be writing to the same file. Instead, you can gather the results from all the threads and print them in the main thread, like this:

def testfunc(string):
    return string

...
...

    print "\n".join(pool.map(testfunc, strings))
thefourtheye