
I'm debugging an application built on LangChain, a Python framework for working with language models. The application also uses the OpenAI Python client library to send requests to the OpenAI API.

While debugging, I want to see the raw prompts the application generates, that is, the text passed to the OpenAI library and from there to the requests library. I assume these prompts are built by some method or function inside LangChain, but I don't know how to access or print them.

I'm also interested in a more general approach that would let me extract and display the prompts sent to the OpenAI API by any application, regardless of the underlying framework. This would be useful for future applications that use a framework other than LangChain but still rely on the OpenAI library.

Can anyone suggest effective ways to achieve these goals? Is it possible to modify the OpenAI Python library itself, or to use tools like Wireshark, Fiddler, or the Python logging library to intercept HTTP requests and view the prompts?

I'm looking for an approach that is both comprehensive and compliant with OpenAI's usage policies. Any help would be greatly appreciated!

Here is one approach I have already tried: extracting the prompts from the OpenAI library's debug-level logs.

The log entries look like this:

DEBUG:openai:api_version=None data='{"prompt": ["\\nToday is Monday, tomorrow is Wednesday.\\n\\nWhat is wrong with that statement?\\n"], "model": "text-davinci-003", "temperature": 0.7, "max_tokens": 256, "top_p": 1, "frequency_penalty": 0, "presence_penalty": 0, "n": 1, "logit_bias": {}}' message='Post details'

To parse these logs, I implemented a Python script as follows:

import sys
import re
import json

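def extract_prompt_list(line: str):
    """Return the prompts found in one openai debug log line, or []."""
    # Anchor on the trailing message= field so a quote inside the payload
    # does not end the match early.
    match = re.search(r"DEBUG:openai:.*?data='(.*)' message=", line)
    if not match:
        return []
    try:
        data = json.loads(match.group(1))
    except json.JSONDecodeError:
        return []
    # Requests without a "prompt" key (for example chat requests) yield [].
    prompt = data.get('prompt', [])
    return [prompt] if isinstance(prompt, str) else prompt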

prompt_lists = (extract_prompt_list(line) for line in sys.stdin)
for prompt_list in prompt_lists:
    if prompt_list:
        for prompt in prompt_list:
            print(f'[PROMPT] {prompt}')
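
To make the parsing step concrete, here is a small self-contained check against the sample entry shown earlier. It is only a sketch: it assumes every entry has the same layout as that sample, with the `data` field followed by ` message=`, and the name `SAMPLE` is just for illustration.

```python
import json
import re

# The sample log entry from above, kept verbatim in a raw string.
SAMPLE = r"""DEBUG:openai:api_version=None data='{"prompt": ["\\nToday is Monday, tomorrow is Wednesday.\\n\\nWhat is wrong with that statement?\\n"], "model": "text-davinci-003", "temperature": 0.7, "max_tokens": 256, "top_p": 1, "frequency_penalty": 0, "presence_penalty": 0, "n": 1, "logit_bias": {}}' message='Post details'"""

# Capture everything between data=' and the closing quote before message=.
match = re.search(r"data='(.*)' message=", SAMPLE)
payload = json.loads(match.group(1))
prompts = payload["prompt"]

print(payload["model"])
for prompt in prompts:
    print(f"[PROMPT] {prompt}")
```

Because the payload in the entry as shown has doubled backslashes, the decoded prompt contains a literal backslash followed by `n` rather than a real line break, so each prompt prints on a single line.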

In order to capture the debug-level logs, I also had to modify my application's logging settings as follows:

import logging

logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)

# ---- the rest is my original application code ---

And finally, I started the application from the command-line using this:

python my_app.py 2> >(python extract.py)

However, I feel that this approach has several shortcomings:

  • It introduces a significant amount of additional code.
  • It requires modifications to the original application code, such as adjusting the logging level.
  • The command-line invocation has become complex and hard to manage.
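
One variant I have considered, which might at least remove the shell redirection, is to parse the records in-process with a `logging.Handler` attached to the `openai` logger. This is a minimal sketch and not a verified solution: it assumes the debug records carry a message in the same layout as the sample entry above, and `PromptCaptureHandler` and `install_prompt_capture` are hypothetical names of my own.

```python
import json
import logging
import re

# Assumed message layout, taken from the sample log entry above.
DATA_PATTERN = re.compile(r"data='(.*)' message=")


class PromptCaptureHandler(logging.Handler):
    """Collects prompts from debug records emitted on the 'openai' logger."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.prompts = []

    def emit(self, record):
        match = DATA_PATTERN.search(record.getMessage())
        if not match:
            return
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            return  # payload was not plain JSON, skip it
        if not isinstance(data, dict):
            return
        prompt = data.get("prompt", [])
        if isinstance(prompt, str):
            prompt = [prompt]
        self.prompts.extend(prompt)


def install_prompt_capture():
    """Attach the handler to the 'openai' logger and return it."""
    handler = PromptCaptureHandler()
    openai_logger = logging.getLogger("openai")
    openai_logger.setLevel(logging.DEBUG)
    openai_logger.addHandler(handler)
    return handler
```

Calling `handler = install_prompt_capture()` once at startup and reading `handler.prompts` later would replace the separate extract script, but it still means touching the application code, so it only addresses part of the list above.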

Given these challenges, I'm looking for alternative ways to achieve my goal. Any suggestions or improvements on this approach are welcome!

washingweb