When I use meta-llama/Llama-2-13b-chat-hf, the answers the model gives are not good. I think I am using the prompt wrong. Below is my code:
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import ElasticVectorSearch, Pinecone, Weaviate, FAISS, Chroma
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate
import transformers
from langchain.chains import ConversationChain
from langchain.memory import ConversationSummaryBufferMemory,ConversationBufferMemory,ConversationSummaryMemory
import torch
import os
from langchain import OpenAI
os.environ['OPENAI_API_KEY'] = 'My key'
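# 4-bit NF4 quantization (bitsandbytes) so the 13B chat model fits in GPU memory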
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)
model_id = 'meta-llama/Llama-2-13b-chat-hf'
hf_auth = '***'
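# Fetch the model config from the Hub (Llama-2 is gated, so the auth token is required)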
model_config = transformers.AutoConfig.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)
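# Load the quantized model and spread it across the available devices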
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    config=model_config,
    quantization_config=bnb_config,
    device_map='auto',
    use_auth_token=hf_auth
)
model.eval()
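# Matching tokenizer for the same checkpoint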
tokenizer = transformers.AutoTokenizer.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)
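# Text-generation pipeline; temperature=0.01 makes decoding close to greedy, up to 512 new tokens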
instruct_pipeline = transformers.pipeline(
    task='text-generation',
    model=model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    tokenizer=tokenizer,
    return_full_text=True,
    max_new_tokens=512,
    top_p=0.99,
    top_k=50,
    repetition_penalty=1.1,
    temperature=0.01
)
hf_pipe = HuggingFacePipeline(pipeline=instruct_pipeline)
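# Prompt template in the Llama-2 [INST]/<<SYS>> style; the retrieved passage is pasted in directly for this test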
prompt_template1 = """<s>[INST] <<SYS>>
{{ You are an AI chatbot having a conversation with a human. The following has three parts. The first part is extracted parts of a long document. The second part is the conversation between you and the human. The third part is the human's question.
If the human's question cannot be answered from the extracted parts, just chat normally with the human. If it can, base your answer on the extracted parts.
Extracted parts:
###
There are 5 steps to find password back.
STEP 1
Go to MEMBER CENTER Click SECURITY CENTER
STEP 2
Select SECURITY CENTER
STEP 3
Select TRANSACTION PASSWORD
STEP 4
Select FORGOT PASSWORD
STEP 5
for bound EMAIL
Enter your BOUND E-MAIL
then you will receive an email with your new password
for bound PHONE NUMBER
Enter your BOUND PHONE NUMBER
###
Previous Conversation:
'''
{history}
'''
Human's question: ```{input}``` }}
<</SYS>>
"""
prompt = PromptTemplate(template=prompt_template1, input_variables=['input', 'history'])
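# Summarize older turns with an OpenAI model once the buffer exceeds 20 tokens (this is why the OpenAI key is set above)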
summary_memory = ConversationSummaryBufferMemory(llm=OpenAI(), max_token_limit=20)
conversation = ConversationChain(
    prompt=prompt,
    llm=hf_pipe,
    verbose=True,
    memory=summary_memory,
)
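To see exactly what string the model receives, I render the template with placeholder values (the history and question below are just made up for the check):

print(prompt.format(
    history="Human: hi\nAI: Hello, how can I help you?",
    input="How do I get my transaction password back?"
))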
The extracted parts are normally retrieved with embeddings; I just typed one in by hand to test. I created a ConversationSummaryBufferMemory to keep track of the conversation, but it seems like meta-llama has its own prompt format for conversation history, and I do not know how to use it.
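My understanding of the Llama-2 chat format is that the system prompt goes inside <<SYS>> once, each past exchange is wrapped in its own [INST] ... [/INST] pair, and the new question ends with [/INST]. Here is a rough sketch of that layout (the build_llama2_prompt helper and the sample turns are just my own illustration, not something from LangChain):

# Rough sketch of the Llama-2 chat layout as I understand it (helper name is mine):
# <s>[INST] <<SYS>> system <</SYS>> user_1 [/INST] answer_1 </s><s>[INST] user_2 [/INST] ...
def build_llama2_prompt(system, turns, question):
    """turns: list of (user, assistant) pairs from earlier in the conversation."""
    text = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
    first = True
    for user, assistant in turns:
        if first:
            text += f"{user} [/INST] {assistant} </s>"
            first = False
        else:
            text += f"<s>[INST] {user} [/INST] {assistant} </s>"
    if turns:
        text += f"<s>[INST] {question} [/INST]"
    else:
        text += f"{question} [/INST]"
    return text

print(build_llama2_prompt(
    'You are a helpful assistant.',
    [('hi', 'Hello! How can I help you?')],
    'How do I get my transaction password back?'
))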
What I want is for meta-llama to answer from the extracted parts when the user's question is related to them, and otherwise just chat with the user normally. Please help me!