
I have a method that continuously generates data and prints it to the console. Let's say something simple like generating random numbers:

import random
import time

def number_generator():
    while True:
        print(random.randint(1, 100))
        time.sleep(0.5)

I have a separate method that is supposed to take this input and write the data to a Kafka topic.

How can I pass this data stream from number_generator() into another method?

P.S.: I know I could just call the Kafka send() method inside number_generator(), but I am learning Python and was curious how data streams can be passed between methods, if they can be.

2 Answers


To create a generator in python, you use the yield keyword:

import random

def number_generator():
    while True:
        yield random.randint(1, 100)

Then, you can instantiate this generator and pass it to a function which is intended to consume it, for example:

def print10(gen):
    n = 0
    for val in gen:
        print(val)
        n += 1
        if n >= 10:
            break

print10(number_generator())
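
Applied to your Kafka use case, the consuming function can write each value to a topic instead of printing it. A minimal sketch, assuming the kafka-python package, a broker at localhost:9092, and a topic called numbers (all placeholders for your setup):

import random
from kafka import KafkaProducer  # kafka-python package (assumed installed)

def number_generator():
    while True:
        yield random.randint(1, 100)

def write_to_kafka(gen):
    # broker address and topic name are placeholders for your setup
    producer = KafkaProducer(bootstrap_servers='localhost:9092')
    for val in gen:
        # Kafka messages are bytes, so encode the number first
        producer.send('numbers', str(val).encode('utf-8'))

write_to_kafka(number_generator())
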
gog

The way I understand your question, you are implicitly assuming some kind of multithreading or multitasking. The easiest way, however, is to keep close to the name of your function and use a generator pattern:

import random
import time

def number_generator():
    while True:
        value = random.randint(1, 100)
        time.sleep(0.5)
        yield value

gen = number_generator()
for num in gen:
    # do something with each value, e.g. send it to Kafka
    print(num)

If you want the generator and the worker to run concurrently, you can use asyncio. This makes sense if your worker also takes some time, e.g. to fetch something from a socket. During that wait time the generator can already produce the next job.

import random
import asyncio

class my_async_generator():
    def __init__(self):
        self.counter = 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        # stop after 10 values
        if self.counter >= 10:
            raise StopAsyncIteration
        await asyncio.sleep(1)  # simulate the time needed to produce a value
        self.counter += 1
        value = random.randint(1, 100)
        return value

async def worker(item):
    await asyncio.sleep(5)  # simulate slow work, e.g. network or socket I/O
    print(item)


async def do_async_stuff():
    tasks = []
    # schedule a worker task for every value the async generator yields
    async for num in my_async_generator():
        tasks.append(asyncio.create_task(worker(num)))
    await asyncio.gather(*tasks)


if __name__ == '__main__':
    asyncio.run(do_async_stuff())

Beyond asyncio there are also multithreading, which in the context of CPython is not that helpful due to the GIL (global interpreter lock), and multiprocessing. For the latter you need some kind of communication channel between the processes, because they no longer share the same memory space.
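
For example, a minimal multiprocessing sketch, assuming a multiprocessing.Queue as the communication channel (the producer/consumer names are just illustrative):

import random
import time
from multiprocessing import Process, Queue

def producer(queue):
    # generate 10 numbers and hand them to the other process via the queue
    for _ in range(10):
        queue.put(random.randint(1, 100))
        time.sleep(0.5)
    queue.put(None)  # sentinel telling the consumer to stop

def consumer(queue):
    while True:
        num = queue.get()
        if num is None:
            break
        print(num)  # this is where a Kafka send() could go instead

if __name__ == '__main__':
    q = Queue()
    p = Process(target=producer, args=(q,))
    c = Process(target=consumer, args=(q,))
    p.start()
    c.start()
    p.join()
    c.join()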

user_na