
I have very little experience with Python and am struggling to find the best pattern here.

I have a generator function, and I would like it to yield the return value of another function until that function is exhausted. This uses the streaming feature of Haystack's PromptNode, although that's really neither here nor there; it's the pattern I'm after.

It is a little chatbot whose results I am trying to stream using gRPC. That part is working; the problem is getting the generator to yield correctly.

My generator looks like this:

class ChatBot(chat_pb2_grpc.ChatBotServicer):

    def AskQuestion(self, request, context):
        query = request.query
        custom_handler = GRPCTokenStreamingHandler()
        prompt_node = PromptNode(
            "gpt-4",
            default_prompt_template=lfqa_prompt,
            api_key=api_key,
            max_length=4096,
            model_kwargs={"stream": True, "stream_handler": custom_handler}
        )

        pipe = Pipeline()
        pipe.add_node(component=retriever, name="retriever", inputs=["Query"])
        pipe.add_node(component=prompt_node, name="prompt_node", inputs=["retriever"])

        output = pipe.run(query=query)

        # This should yield the "return token_received" from custom_handler
        #     yield chat_pb2.Response(token=token, final=False)

        yield chat_pb2.Response(token="", final=True)

And my GRPCTokenStreamingHandler looks like this:

class GRPCTokenStreamingHandler(TokenStreamingHandler):

    def __call__(self, token_received, **kwargs) -> str:
        return token_received

The way the PromptNode works, output that would ordinarily be printed token by token to the console is instead directed to this GRPCTokenStreamingHandler class.

So each of those token_received values should be yielded by the AskQuestion generator.

How can I do this?

serlingpa

1 Answer


I don't think you really need a generator to accomplish your goal in this case.

If you want to stream token by token, you must send each received token via gRPC from your custom GRPCTokenStreamingHandler.

The streaming handler is used internally by the PromptNode and is called each time a part of the streamed response is received (from ChatGPT, in this case).

Pipeline.run() is also blocking: by the time it returns, every token has already been received.
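If you do want AskQuestion to stay a generator, a common way to bridge the blocking run() call is to run it in a worker thread while the handler pushes each token onto a queue that the generator drains. The sketch below uses only the standard library and hypothetical names (QueueTokenHandler, stream_tokens, fake_pipeline are all illustrative, not Haystack's or gRPC's API):

```python
import queue
import threading

END_OF_STREAM = object()  # sentinel marking that the pipeline has finished


class QueueTokenHandler:
    """Hypothetical handler: pushes each received token onto a queue
    instead of only returning it."""

    def __init__(self):
        self.tokens = queue.Queue()

    def __call__(self, token_received, **kwargs):
        self.tokens.put(token_received)
        return token_received


def stream_tokens(run_pipeline, handler):
    """Run the blocking pipeline call in a worker thread and yield
    tokens from the handler's queue as they arrive."""

    def worker():
        try:
            run_pipeline()  # blocking call, e.g. pipe.run(query=query)
        finally:
            handler.tokens.put(END_OF_STREAM)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        token = handler.tokens.get()
        if token is END_OF_STREAM:
            break
        yield token


# Demo with a fake "pipeline" that calls the handler three times:
handler = QueueTokenHandler()

def fake_pipeline():
    for t in ["Hello", " ", "world"]:
        handler(t)

print(list(stream_tokens(fake_pipeline, handler)))  # ['Hello', ' ', 'world']
```

Inside AskQuestion you would then wrap each yielded token in a chat_pb2.Response before sending it, and emit the final=True message once the loop ends.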

Silvano Cerza