I'm trying to understand which operations RxPY serializes and which it does not, so I printed the thread name and the elapsed time in seconds inside the map and subscribe callbacks in the example below.

I expected the map callback to fire at roughly 1, 2, 3, 4 and 5 seconds, but it actually fires at roughly 1, 3, 5, 7 and 9 seconds. (The extra 2 s before each subscribe callback is expected, because of the time.sleep(2) inside map.) Why is that? It looks like map for the 2nd element does not start until the subscribe callback for the 1st element has finished, even though the 1st and 2nd elements each run map and subscribe on their own thread.

import reactivex as rx
import concurrent.futures
import time
from reactivex import operators as ops
from threading import current_thread
with concurrent.futures.ThreadPoolExecutor(5) as executor:
    start = time.time()
    rx.range(1, 6).pipe(
        ops.flat_map(lambda s: rx.from_future(executor.submit(lambda x: time.sleep(x) or x, s))),
        ops.map(lambda x: print('map', current_thread().name, time.time()-start,x) or time.sleep(2) or x)
    ).subscribe(lambda x: print('sub', current_thread().name,time.time()-start,x))

This gives the output:

map ThreadPoolExecutor-21_0 1.0019810199737549 1
sub ThreadPoolExecutor-21_0 3.0042216777801514 1
map ThreadPoolExecutor-21_1 3.004584789276123 2
sub ThreadPoolExecutor-21_1 5.006811141967773 2
map ThreadPoolExecutor-21_2 5.007160663604736 3
sub ThreadPoolExecutor-21_2 7.008445978164673 3
map ThreadPoolExecutor-21_4 7.008780241012573 5
sub ThreadPoolExecutor-21_4 9.01101279258728 5
map ThreadPoolExecutor-21_3 9.01136064529419 4
sub ThreadPoolExecutor-21_3 11.013587951660156 4
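
One reading of this output, stated as an assumption rather than a confirmed description of RxPY internals, is that the merge step inside flat_map hands items downstream one at a time, since the Rx contract expects on_next calls not to overlap. A blocking time.sleep(2) in map would then hold up every other element that is ready. The sketch below does not use RxPY at all; it mimics that behavior with a plain threading.Lock (emit_lock, UNIT and produce_then_emit are made-up names for illustration) and scales each "second" down to 0.05 s.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

UNIT = 0.05  # one scaled-down "second", so the sketch finishes quickly

# Hypothetical stand-in for serialized downstream delivery (an assumption,
# not RxPY's actual code): only one item may be "in map" at a time.
emit_lock = threading.Lock()
spans = []  # (item, map_start, map_end) in units since start, in delivery order


def produce_then_emit(item, start):
    time.sleep(item * UNIT)  # the per-item future: sleep(x)
    with emit_lock:  # wait until no other item is being delivered
        map_start = (time.monotonic() - start) / UNIT
        time.sleep(2 * UNIT)  # the blocking work done inside map
        map_end = (time.monotonic() - start) / UNIT
        spans.append((item, map_start, map_end))
    return item


def run():
    spans.clear()
    start = time.monotonic()
    with ThreadPoolExecutor(5) as executor:
        futures = [executor.submit(produce_then_emit, i, start) for i in range(1, 6)]
        results = [f.result() for f in futures]
    return results, list(spans)


if __name__ == "__main__":
    results, recorded = run()
    for item, map_start, map_end in recorded:
        print(f"item {item}: map from ~{map_start:.1f} to ~{map_end:.1f} units")
```

With the lock in place the map intervals never overlap, so the map start times drift about 2 units apart instead of 1, which is the same shape as the RxPY output above. Removing the `with emit_lock:` line lets the intervals overlap again.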

How can I process these 5 elements on 5 threads without the additional wait, while keeping thread affinity (each element is processed by the same thread across all operators in the pipe)? I'm after behavior similar to what ParallelStream/pseq from the pyfunctional package gives below, but ideally using a thread pool instead of processes.

import time
from functional import pseq
from multiprocessing import current_process
start = time.time()
pseq(range(1, 6)).map(lambda x: time.sleep(x) or x)\
    .map(lambda x:print('map', current_process().name, time.time()-start, x) or time.sleep(2) or x)\
    .map(lambda x:print('sub', current_process().name, time.time()-start, x) or x)

which gives the output:

map ForkPoolWorker-80 1.0282073020935059 1
map ForkPoolWorker-81 2.031876802444458 2
sub ForkPoolWorker-80 3.0305001735687256 1
map ForkPoolWorker-82 3.035196304321289 3
sub ForkPoolWorker-81 4.034159898757935 2
map ForkPoolWorker-83 4.038431644439697 4
sub ForkPoolWorker-82 5.03748893737793 3
map ForkPoolWorker-84 5.038834571838379 5
sub ForkPoolWorker-83 6.040730953216553 4
sub ForkPoolWorker-84 7.0410990715026855 5
[1, 2, 3, 4, 5]
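
For comparison, the thread-affinity behavior described above can be sketched with only the standard library: each element is submitted once to a ThreadPoolExecutor and all stages run back to back inside that one task, so an element never changes threads. This steps outside the RxPY operator chain entirely, so it is a point of reference rather than an RxPY solution. The names run_pipeline, slow_source, map_stage, sub_stage and UNIT are hypothetical, and each "second" is scaled down to 0.05 s.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from threading import current_thread

UNIT = 0.05  # one scaled-down "second"
log = []  # (stage, thread name, item); list.append is thread-safe in CPython


def slow_source(x):
    time.sleep(x * UNIT)  # stands in for the per-item future
    return x


def map_stage(x):
    log.append(("map", current_thread().name, x))
    time.sleep(2 * UNIT)  # blocking work, only delays this element
    return x


def sub_stage(x):
    log.append(("sub", current_thread().name, x))
    return x


def run_pipeline(items, stages, max_workers=5):
    """Run every stage for one item inside a single pool task."""

    def process(item):
        for stage in stages:
            item = stage(item)
        return item

    with ThreadPoolExecutor(max_workers) as executor:
        # executor.map returns results in input order
        return list(executor.map(process, items))


if __name__ == "__main__":
    print(run_pipeline(range(1, 6), [slow_source, map_stage, sub_stage]))
    for entry in log:
        print(entry)
```

Because nothing is shared between the per-element tasks, a slow stage for one element does not delay the others, at the cost of losing Rx operators such as flat_map between the stages.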
Shuming