4

the following code has 5 workers .... each opens its own worker_task()

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    future_to_url = {executor.submit(worker_task, command_, site_): site_ for site_ in URLS}

    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try: data = future.result()

BUT ..... inside each worker_task() ...... I cannot identify ... which of the 5 workers is currently being used (Worker_ID)

If I want to print('worker 3 has finished') inside worker_task() ..... I cannot do this because executor.submit does not allow

Any solutions?

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
Rhys
  • 4,926
  • 14
  • 41
  • 64
  • https://stackoverflow.com/questions/65480812/identifying-thread-running-function-using-concurrent-futures-thread-pool-in-pyth --> this should be helpful. – njari Aug 04 '21 at 08:46
  • thank you, this does not assign a **worker** to my `URLS` list ... I would need to create a **done/pending** variable for every `URLS` ... which cannot even fit into the `executor.submit()` statement ........ this link provides the ability to track thread ID's ..... but doing so is useless without linking a task URL with a worker – Rhys Aug 04 '21 at 10:13
  • Any feedback please? – Artiom Kozyrev Aug 06 '21 at 13:05

1 Answers1

6

You can get name of worker thread with the help of threading.current_thread() function. Please find some example below:

from concurrent.futures import ThreadPoolExecutor, Future
from threading import current_thread
from time import sleep
from random import randint

# imagine these are urls
URLS = [i for i in range(100)]


def do_some_work(url, a, b):
    """Simulates some work"""
    sleep(2)
    rand_num = randint(a, b)
    if rand_num == 5:
        raise ValueError("No! 5 found!")
    r = f"{current_thread().getName()}||: {url}_{rand_num}\n"
    return r


def show_fut_results(fut: Future):
    """Callback for future shows results or shows error"""
    if not fut.exception():
        print(fut.result())
    else:
        print(f"{current_thread().getName()}|  Error: {fut.exception()}\n")


if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=10) as pool:
        for i in URLS:
            _fut = pool.submit(do_some_work, i, 1, 10)
            _fut.add_done_callback(show_fut_results)

If you want more control over threads, use threading module:

from threading import Thread
from queue import Queue
from time import sleep
from random import randint
import logging

# imagine these are urls
URLS = [f"URL-{i}" for i in range(100)]

# number of worker threads
WORKER_NUM = 10


def do_some_work(url: str, a: int, b: int) -> str:
    """Simulates some work"""
    sleep(2)
    rand_num = randint(a, b)
    if rand_num == 5:
        raise ValueError(f"No! 5 found in URL: {url}")
    r = f"{url} = {rand_num}"
    return r


def thread_worker_func(q: Queue, a: int, b: int) -> None:
    """Target function for Worker threads"""
    logging.info("Started working")
    while True:
        try:
            url = q.get()

            # if poison pill - stop worker thread
            if url is None:
                break

            r = do_some_work(url, a, b)
            logging.info(f"Result: {r}")
        except ValueError as ex:
            logging.error(ex)
        except Exception as ex:
            logging.error(f"Unexpected error: {ex}")

    logging.info("Finished working")


if __name__ == '__main__':
    logging.basicConfig(
        level=logging.INFO,
        format="%(levelname)s | %(threadName)s | %(asctime)s | %(message)s",
    )
    in_q = Queue(50)
    workers = [
        Thread(target=thread_worker_func, args=(in_q, 1, 10, ), name=f"MyWorkerThread-{i+1}")
        for i in range(WORKER_NUM)
    ]
    [w.start() for w in workers]

    # start distributing tasks
    for _url in URLS:
        in_q.put(_url)

    # send poison pills to worker-threads
    for w in workers:
        in_q.put(None)

    # wait worker thread to join Main Thread
    logging.info("Main Thread waiting for Worker Threads")
    [w.join() for w in workers]

    logging.info("Workers joined")
    sleep(10)
    logging.info("App finished")
Artiom Kozyrev
  • 3,526
  • 2
  • 13
  • 31
  • I want to send the name of thread `thread1` ... into `do_some_work()` .......... I am trying to learn what the name is, and how to summon it – Rhys Aug 08 '21 at 00:53
  • however, I realize now ... I could create a **list [1,10] idle threads**. then `randomly select` an **idle** thread ... and assign it within the function call `submit()` – Rhys Aug 08 '21 at 00:55
  • your random number usage ... perhaps is the key to answer .... Prefer to see selection of **[idle number]** such as **thread1 [busy], thread2 [busy], thread3 [selected]** .............. instead of `if rand_num == 5:` ...... Thread_Assignment for printing `[thread1 loaded texture]` inside the `do_some_work()` – Rhys Aug 08 '21 at 01:06
  • 1
    @Rhys if you want to control name of thread, you have to use `threading.Thread` instead of `concurrent.futuretes.ThreadPoolExecutor`. If you take a look at `ThreadPoolExecutor` source code, you will quickly realize that it is wrapper around `threading.Thread`: https://github.com/python/cpython/blob/3.9/Lib/concurrent/futures/thread.py If you really need more control, e.g. control threads names, choose `threading` module. I can provide you with some example if you wish. – Artiom Kozyrev Aug 08 '21 at 11:51
  • Thanks Artiom ... If you do provide the `threading.Thread` example I will mark it as correct ... otherwise I do have examples available ... I've worked in `threading.Thread` ... – Rhys Aug 09 '21 at 23:07
  • @Rhys added to asnwer example based on `threading.Thread` and `producer-consumers` pattern with `poison pill` technique – Artiom Kozyrev Aug 10 '21 at 09:09