Use this tag for questions about large language models (LLMs): deep-learning models trained to interpret and generate natural-language text.
Questions tagged [large-language-model]
118 questions
1
vote
1 answer
How to extract sub-string from Haystack's print_answers
I was following this tutorial from pinecone.io about using Haystack's print_answers.
And as you can see in the later part of the tutorial, the output contains a lot of text. This string-like output is not subscriptable, and thus I'm not able to…
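One way around this: `print_answers` only pretty-prints; instead of parsing its printed output, you can index into the result the pipeline returns. A minimal sketch, assuming a Haystack-v1-style result structure (the `result` dict below is mocked with plain dicts standing in for real Answer objects):

```python
# Sketch: instead of parsing print_answers() output, index the result dict.
# The structure mimics what a Haystack v1 extractive QA pipeline returns;
# real Answer objects expose .answer/.score attributes, mocked here as dicts.
result = {
    "query": "Who wrote Hamlet?",
    "answers": [
        {"answer": "William Shakespeare", "score": 0.97},
        {"answer": "Shakespeare", "score": 0.85},
    ],
}

# Pull out just the answer strings -- the "sub-string" you actually want.
top_answer = result["answers"][0]["answer"]
all_answers = [a["answer"] for a in result["answers"]]

print(top_answer)
print(all_answers)
```

With real Haystack objects the indexing is the same idea, e.g. `result["answers"][0].answer` instead of the dict lookup.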

fast_crawler
- 91
- 7
1
vote
0 answers
Llama.generate: prefix-match hit
I am using "llama-2-7b-chat.ggmlv3.q2_K.bin" (from Hugging Face) via "LlamaCpp()" in LangChain. The "Llama.generate: prefix-match hit" process repeats many times and the model answers itself. But I want the answer only once. How can I set this to…
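A common fix for a model "answering itself" is to supply stop sequences so generation halts before the model starts a new conversational turn (LangChain's LlamaCpp wrapper accepts a `stop` list for this). The truncation logic itself can be sketched in plain Python — the chat-marker strings here are assumptions to adapt to your own prompt template:

```python
def truncate_at_stop(text: str, stop_sequences: list[str]) -> str:
    """Cut generated text at the first occurrence of any stop sequence,
    so the model cannot keep 'answering itself' with extra turns."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()

# Hypothetical chat-style markers -- adjust to your prompt template.
raw = "Paris is the capital of France.\nUser: What about Spain?\nAssistant: Madrid."
print(truncate_at_stop(raw, ["User:", "Assistant:"]))
# -> Paris is the capital of France.
```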

Mayuresh Gawai
- 21
- 1
- 6
1
vote
0 answers
What is the purpose of the "prepare_model_for_int8_training" function in the LLama code?
I noticed that removing this line:
model = prepare_model_for_int8_training(model)
causes the model to easily produce a loss value of "nan" if I load the model in 8-bit. Can someone explain the necessity of this function?
Furthermore, the purpose of…

ysngki
- 11
- 2
1
vote
3 answers
Importing ConversationalRetrievalChain from langchain.chains isn't working
I am trying to follow various tutorials on LangChain and Streamlit and have encountered many problems with import names. My main problem is that I can't seem to import ConversationalRetrievalChain from langchain.chains. This isn't the…

Tristan Tucker
- 33
- 5
1
vote
1 answer
Fine tune an LLM NOT on question/answer dataset
Most of the material out there for fine-tuning LLMs uses a question/answer dataset. Problem is, that's not my use case. I would like to fine-tune an LLM on domain knowledge that exists as a set of documents, and that set can't really…
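Fine-tuning on raw documents is just causal-LM training: tokenize the documents, concatenate them, and split into fixed-length blocks — no Q/A pairs required. A minimal preprocessing sketch, with whitespace splitting standing in for a real tokenizer:

```python
def make_training_blocks(documents: list[str], block_size: int) -> list[list[str]]:
    """Concatenate tokenized documents and split into fixed-length blocks --
    the standard preprocessing for causal-LM fine-tuning on raw text.
    Whitespace splitting stands in for a real tokenizer here."""
    tokens: list[str] = []
    for doc in documents:
        tokens.extend(doc.split())
    # Drop the trailing remainder so every block is exactly block_size tokens.
    n_full = len(tokens) // block_size
    return [tokens[i * block_size:(i + 1) * block_size] for i in range(n_full)]

docs = ["alpha beta gamma delta", "epsilon zeta eta theta iota"]
blocks = make_training_blocks(docs, block_size=4)
print(blocks)
# -> [['alpha', 'beta', 'gamma', 'delta'], ['epsilon', 'zeta', 'eta', 'theta']]
```

The resulting blocks feed straight into a language-modeling objective (labels = input ids shifted), which is how libraries like Hugging Face's `Trainer` handle plain-text corpora.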

Demiurg
- 1,597
- 8
- 26
- 40
1
vote
1 answer
Relationship between embedding models and LLM's inference models in a RAG architecture
I am trying to implement a RAG architecture in AWS with documents that are in Spanish.
My question is the following: does it matter if I generate the embeddings of the documents with a model trained in English or multilingual? Or do I have to…

Josalo9
- 23
- 3
1
vote
2 answers
Why do I get the error "Unrecognized request argument supplied: functions" when using `functions` when calling Azure OpenAI GPT?
I'm trying to use functions when calling Azure OpenAI GPT, as documented in https://platform.openai.com/docs/api-reference/chat/create#chat/create-functions
I use:
import openai
openai.api_type = "azure"
openai.api_base =…
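With Azure OpenAI this error typically means the configured API version predates function calling; versions before `2023-07-01-preview` reject the `functions` argument. A configuration sketch assuming the pre-1.0 `openai` SDK shown in the question (resource name and key are placeholders):

```python
import openai

openai.api_type = "azure"
openai.api_base = "https://YOUR-RESOURCE.openai.azure.com/"  # placeholder
# Function calling requires an API version that supports it; older versions
# fail with "Unrecognized request argument supplied: functions".
openai.api_version = "2023-07-01-preview"
openai.api_key = "..."  # placeholder
```

Note the deployed model must also support functions (e.g. a 0613 or later GPT model version).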

Franck Dernoncourt
- 77,520
- 72
- 342
- 501
1
vote
1 answer
Llama QLora error: Target modules ['query_key_value', 'dense', 'dense_h_to_4h', 'dense_4h_to_h'] not found in the base model
EDIT:
solved by removing target_modules
I tried to load Llama-2-7b-hf LLM with QLora with the following code:
model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_auth_token=True) # I have permissions.
model…
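The module names in the error (`query_key_value`, `dense`, `dense_h_to_4h`, `dense_4h_to_h`) are Falcon/GPT-NeoX-style layer names; Llama models use `q_proj`/`k_proj`/`v_proj`/`o_proj`. So either set those explicitly in `LoraConfig(target_modules=...)` or omit `target_modules` and let PEFT infer them. An illustrative (not exhaustive) lookup in plain Python:

```python
# LoRA target modules differ per architecture: the names in the error
# belong to Falcon/GPT-NeoX-style models, not Llama. Illustrative table:
LORA_TARGETS = {
    "llama": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "falcon": ["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"],
}

def targets_for(model_id: str) -> list[str]:
    """Pick LoRA target_modules from the model id (naive substring match)."""
    for family, modules in LORA_TARGETS.items():
        if family in model_id.lower():
            return modules
    raise ValueError(f"Unknown architecture for {model_id}")

print(targets_for("meta-llama/Llama-2-7b-hf"))
# -> ['q_proj', 'k_proj', 'v_proj', 'o_proj']
```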

Ofir
- 590
- 9
- 19
1
vote
0 answers
Hugging Face pipeline error from LangChain: PydanticUserError
I'm getting the following error while trying to load the Hugging Face pipeline from LangChain:
PydanticUserError: If you use @root_validator with pre=False (the
default) you MUST specify skip_on_failure=True. Note that
@root_validator is deprecated and…

Abdul Basit
- 11
- 1
1
vote
0 answers
LangChain Agent that uses a tool multiple times until a stopping criteria is met
I want to create an agent with LangChain and followed one of their tutorials.
In my use case, I want to generate text with gpt and score the generated text with some kind of metrics. If the score of these metrics is too low, I want the agent to…
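The generate-score-retry loop described here can be sketched without any agent framework; the stopping criterion is simply a score threshold plus a retry budget. The `generate` and `score` callables below are toy stand-ins, not real APIs:

```python
def generate_until_good(generate, score, threshold: float, max_tries: int = 5) -> str:
    """Call an LLM repeatedly, scoring each output, until the score meets
    the threshold or the retry budget runs out (the stopping criterion).
    Returns the best-scoring text seen."""
    best_text, best_score = "", float("-inf")
    for _ in range(max_tries):
        text = generate()
        s = score(text)
        if s > best_score:
            best_text, best_score = text, s
        if s >= threshold:
            break
    return best_text

# Toy stand-ins for the LLM call and the metric (assumptions, not real APIs):
attempts = iter(["bad draft", "better draft", "great draft"])
result = generate_until_good(
    generate=lambda: next(attempts),
    score=lambda t: {"bad draft": 0.2, "better draft": 0.5, "great draft": 0.9}[t],
    threshold=0.8,
)
print(result)
# -> great draft
```

In LangChain terms the same idea can live inside a custom tool or a loop around the agent call; the framework-free version makes the control flow explicit.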

Kuehlschrank
- 11
- 3
1
vote
1 answer
Langchain agents
I have a problem using a LangChain agent with SerpAPI together with a local LLM. I have successfully done the same thing when connecting with OpenAI.
My local LLM is either the MPT-7B model or the 30B_Lazarus in text-generation mode.
My code looks…

zoomraider
- 117
- 1
- 9
1
vote
0 answers
Best Local Model for Running Questions on Docs
Looking for some recommendations with respect to ideal LLM models that fit the following criteria:
Open Source
- Downloadable (can be run offline on a local server)
Can be used for answering questions on the basis of information contained in existing…

Abhay
- 87
- 7
1
vote
1 answer
Running LLM on a local server
I am new to LLMs. I need to run an LLM on a local server and download different models to experiment with. I am trying to follow this guide from Hugging Face: https://huggingface.co/docs/transformers/installation#offline-mode
To begin with, I…
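The offline-mode guide linked above boils down to two documented environment variables: once the models are cached locally (e.g. downloaded on a machine with internet access), `transformers` and `datasets` can be told to resolve everything from the cache. A minimal sketch:

```python
import os

# Hugging Face documents these variables for offline use: with them set,
# libraries load models/datasets from the local cache instead of the Hub.
os.environ["TRANSFORMERS_OFFLINE"] = "1"
os.environ["HF_DATASETS_OFFLINE"] = "1"

# from transformers import AutoModelForCausalLM  # now resolves from cache
```

Set the variables before importing the libraries (or export them in the shell), since they are read at import/load time.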

zoomraider
- 117
- 1
- 9
1
vote
0 answers
Replacing UI with LLMs
How can one replace the UI of an application with an LLM's chat window? The bot should be able to do everything it used to, but via natural language. So the end user doesn't have to click buttons or view options in a menu; rather, they should be…
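One common pattern for this is to expose each UI action as a named "tool" and let the LLM choose which to invoke (e.g. via function calling); a thin dispatch layer then executes the chosen action. A sketch in plain Python — the action names and handlers are hypothetical placeholders:

```python
# Each UI action becomes a named "tool"; the LLM is asked to return the
# tool name plus arguments (e.g. via function calling), and this dispatch
# layer executes it. Handlers here are hypothetical placeholders.
ACTIONS = {
    "open_settings": lambda **kw: "settings opened",
    "search": lambda query="", **kw: f"searched for {query!r}",
}

def dispatch(tool_call: dict) -> str:
    """Execute the action the model selected instead of a button click."""
    handler = ACTIONS.get(tool_call["name"])
    if handler is None:
        return f"unknown action: {tool_call['name']}"
    return handler(**tool_call.get("arguments", {}))

# As if the model had replied with a structured tool call:
print(dispatch({"name": "search", "arguments": {"query": "llm"}}))
# -> searched for 'llm'
```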

theodre7
- 125
- 4
1
vote
2 answers
Is there a way to edit Langchain Agent's conversation as it is going on?
I'm using LangChain to query a MySQL database, but LangChain agents always go over OpenAI's 4k token limit. When I looked into the agent's conversation history, it seems the agent called schema_sql_db multiple times and the table schemas took…
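One way to keep such an agent under the context limit is to trim the accumulated history to a token budget before each call, dropping the oldest messages (e.g. repeated schema dumps) first. A framework-free sketch, with word count standing in for real token counting:

```python
def trim_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep only the most recent messages that fit a token budget -- one way
    to stop accumulated tool output (e.g. repeated schema dumps) from
    blowing past the context limit. Word count stands in for tokens here."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):
        cost = len(msg.split())
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = ["schema dump " * 10, "user question here", "agent answer here"]
print(trim_history(history, max_tokens=8))
# keeps only the last two messages; the 20-word schema dump is dropped
```

Restricting the agent to the tables it actually needs (so the schemas are smaller to begin with) is a complementary fix.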

David
- 21
- 2