How to pass a CSV file or a dataframe to Pandas REPL tool in Langchain?

Question

I am trying to use the Python REPL tool in LangChain with a CSV file so that the agent answers questions based on the file's contents. The agent gets the action_input step right, in that it writes sensible code to execute. However, it then fails to answer because it cannot determine which dataframe the code should run on.

For example, I ask it about the longest name in a dataframe containing a column named "name" and it returns the following:

Entering new AgentExecutor chain...
{
    "action": "python_repl_ast",
    "action_input": "import pandas as pd\n\n# Assuming the dataset is stored in a pandas DataFrame called 'data'\nnames = data['NAMES']\nlongest_name = max(names, key=len)\nlongest_name"
}
Observation: NameError: name 'data' is not defined
Thought:{
    "action": "Final Answer",
    "action_input": "I apologize for the confusion. Unfortunately, I do not have access to the dataset required to find the longest name. Is there anything else I can assist you with?"
}

This is the full code:

```
df = pd.read_csv(file_path)

tools = [PythonAstREPLTool(locals={"df": df})]

agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    early_stopping_method='generate',
    memory=conversational_memory
)

query = 'What is the longest name?'

print(agent(query))
```

Is there a way to pass the dataframe object to the Pandas REPL tool so that the code executes properly and returns the answer? I encounter this problem when using the GPT-3.5-turbo model through the API.

Try PythonREPLTool with CSV agent :- https://stackoverflow.com/questions/76879308/data-visualization-with-langchain-create-sql-agent/76896903#76896903 — Rishi, Aug 14 '23 at 07:09

tuffgatmasta · Answer 1 · 2023-08-21T13:21:35.440

0

Rather than wiring up the REPL tool yourself, you can create the agent with create_pandas_dataframe_agent, which takes the df directly so that the generated code has a dataframe to run against:

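# Import paths as of the mid-2023 LangChain releases; other versions may place these elsewhere.
from langchain.agents import create_pandas_dataframe_agent
from langchain.llms import OpenAI

agent = create_pandas_dataframe_agent(
    OpenAI(temperature=0), df, verbose=True
)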
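
For context, the NameError in the question appears to come from a naming mismatch: the tool was given the dataframe under the name df, but the model wrote code against a variable it guessed, data. The sketch below reproduces that mismatch with plain eval() and a dictionary standing in for the dataframe. It is an illustration only, not LangChain's implementation, and tool_locals and run_in_tool are hypothetical names introduced for this example.

```python
# Illustration only: this mimics a REPL tool's namespace with plain eval();
# it is not LangChain's actual implementation.
# A plain dict stands in for the DataFrame registered as locals={"df": df}.
tool_locals = {"df": {"name": ["Al", "Barbara", "Christopher"]}}


def run_in_tool(code: str) -> str:
    """Evaluate an expression against tool_locals, returning errors as text."""
    try:
        return repr(eval(code, {}, tool_locals))
    except Exception as exc:  # report the error the way an observation would
        return f"{type(exc).__name__}: {exc}"


# The model guessed a variable called 'data', which the namespace lacks.
wrong = run_in_tool("max(data['name'], key=len)")

# The same expression works once it uses the name that was registered.
right = run_in_tool("max(df['name'], key=len)")

print(wrong)
print(right)
```

If this is indeed the cause, telling the model in the prompt or the tool description that the dataframe is called df may also be enough. The dataframe agent above likely sidesteps the issue because it is built around the df it is given.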

edited Aug 21 '23 at 13:21

answered Aug 21 '23 at 13:06
