
I am trying to return the result of llama_index's index.query(query, streaming=True) as a streaming HTTP response, but I'm not sure how to do it.

This first attempt obviously doesn't work:

index = GPTSimpleVectorIndex.load_from_disk(index_file)
return index.query(query, streaming=True)

Error message: TypeError: cannot pickle 'generator' object.

This one doesn't work either:

def stream_chat(query: str, index):
    for chunk in index.query(query, streaming=True):
        print(chunk)
        content = chunk["response"]
        if content is not None:
            yield content

# in another function
index = GPTSimpleVectorIndex.load_from_disk(index_file)
return StreamingResponse(stream_chat(query, index), media_type="text/html")

Error message: TypeError: 'StreamingResponse' object is not iterable.

Thanks!

Taishi Kato
2 Answers


OK, I figured it out.

The answer is:

return StreamingResponse(index.query(query, streaming=True).response_gen)
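For context, a fuller FastAPI endpoint built around that line might look like the sketch below (the route path, index file name, and media type are assumptions, not part of the original answer):

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from llama_index import GPTSimpleVectorIndex

app = FastAPI()
index = GPTSimpleVectorIndex.load_from_disk("index.json")

@app.get("/chat")
def chat(query: str):
    # query(..., streaming=True) returns a streaming response object;
    # its response_gen attribute is a generator of text chunks, which
    # FastAPI's StreamingResponse forwards to the client as they arrive.
    streaming_response = index.query(query, streaming=True)
    return StreamingResponse(streaming_response.response_gen, media_type="text/plain")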
Taishi Kato
  • If anyone is trying to figure out how to do this with a Python Flask server, just wrap the response_gen object in Flask's stream_with_context() helper. In your routed method: `return stream_with_context(index.query(query, streaming=True).response_gen)` – jkriddle Apr 26 '23 at 15:10
  • Does anyone have advice on how to implement this in Django? – gmaly Jun 10 '23 at 13:26

If you are using Flask, use stream_with_context. The final code would be as shown below.

return stream_with_context(index.as_query_engine(streaming=True).query(user_prompt).response_gen)
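For reference, a complete Flask route using the newer llama_index query-engine API might look like this sketch (the route path, storage directory, and request parsing are assumptions):

from flask import Flask, request, stream_with_context
from llama_index import StorageContext, load_index_from_storage

app = Flask(__name__)
index = load_index_from_storage(StorageContext.from_defaults(persist_dir="./storage"))

@app.route("/chat")
def chat():
    user_prompt = request.args.get("q", "")
    # response_gen yields text chunks; stream_with_context keeps the
    # request context available while Flask streams them to the client.
    streaming_response = index.as_query_engine(streaming=True).query(user_prompt)
    return stream_with_context(streaming_response.response_gen)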
Jayakrishnan