Use this tag for questions about large language models (LLMs): deep-learning models trained to interpret and generate natural-language text.
Questions tagged [large-language-model]
118 questions
0 votes, 0 answers
Adding Spaces to Language Model Output for User-Friendly Display
Large language models generate a list of tokens, but to show these tokens to a user we need to join them correctly, so how do we know where to put spaces to display readable text?
For example, this text was tokenized using OpenAI…
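One possible answer (not from the question itself): byte-pair tokenizers such as OpenAI's encode any leading space as part of the token, so display is plain concatenation rather than inserting spaces. A minimal sketch with a hypothetical token list:

```python
# Sketch: BPE tokenizers (e.g. OpenAI's tiktoken vocabularies) keep the
# leading space *inside* each token, so rendering is plain concatenation.
# The token list below is hypothetical, not real tiktoken output.
tokens = ["Hello", ",", " how", " are", " you", "?"]

text = "".join(tokens)  # no separator needed: spaces travel with the tokens
print(text)
```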

hakim47
0 votes, 0 answers
Facing OSError trying to download the model in OpenLLM
I am trying to run OpenLLM but I am facing this error.
I ran 3 commands:
!pip3 install openllm
!openllm -h
!openllm start opt
but the 3rd command's output is generating this error:
OSError: model.safetensors or model.safetensors.index.json and…
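This OSError means no recognised weight file was found where the loader looked. A quick local check, as a sketch (the first two filenames are taken from the traceback; `pytorch_model.bin` is an assumption, the usual PyTorch alternative on the Hub):

```python
from pathlib import Path

# Filenames Hugging Face loaders look for; pytorch_model.bin is an assumption.
WEIGHT_FILES = ("model.safetensors", "model.safetensors.index.json",
                "pytorch_model.bin")

def has_weights(model_dir: str) -> bool:
    """Return True if the directory contains any recognised weight file."""
    d = Path(model_dir)
    return any((d / name).is_file() for name in WEIGHT_FILES)
```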

mirdul agarwal
0 votes, 0 answers
How to keep an LLM's response in a specific language?
I have been working on a RAG architecture with the Falcon-40b model, and the implementation of the chatbot must be in Spanish. I understand that Falcon-40b is multilingual. So far the RAG has returned good responses, but it sometimes outputs the response…
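A common mitigation is an explicit language instruction prepended to every prompt. A sketch; the instruction wording and prompt layout are assumptions, not from the question:

```python
# Sketch: pin the output language with an explicit instruction prepended to
# every RAG prompt. The wording below is a made-up example.
SYSTEM_ES = ("Responde siempre en español, aunque la pregunta o el "
             "contexto estén en otro idioma.")
# English: "Always answer in Spanish, even if the question or context
# is in another language."

def build_prompt(context: str, question: str) -> str:
    """Assemble a RAG prompt that asks the model for Spanish answers."""
    return (f"{SYSTEM_ES}\n\n"
            f"Contexto:\n{context}\n\n"
            f"Pregunta: {question}\nRespuesta:")
```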

Josalo9
0 votes, 0 answers
langchain and vectorDB always return different answers
I am playing with LangChain, OpenAI, and Pinecone (a vector DB).
I have generated a random list of 16 toys. Each toy is in a row with a short description.
LEGO sets: LEGO offers a wide range of building sets, including
themed sets based on…
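One frequent cause of varying answers (an assumption about this setup, but worth checking first) is a nonzero sampling temperature on the LLM. A toy temperature-scaled sampler illustrating why: at temperature near zero the choice collapses to the argmax and becomes deterministic.

```python
import math
import random

def sample(logits, temperature, rng):
    """Pick an index from `logits` via temperature-scaled softmax sampling."""
    if temperature <= 1e-6:                      # greedy: always the argmax
        return max(range(len(logits)), key=logits.__getitem__)
    scaled = [l / temperature for l in logits]
    m = max(scaled)                              # subtract max for stability
    probs = [math.exp(s - m) for s in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]
    return rng.choices(range(len(logits)), weights=probs)[0]
```

With the OpenAI chat models in LangChain this corresponds to setting `temperature=0` on the model for reproducible answers.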

Khan
0 votes, 1 answer
Getting connection refused error using the openllm library of Python
I am trying to utilise this GitHub repo, particularly the Python code below:
import openllm
client = openllm.client.HTTPClient('http://localhost:3000')
client.query('Explain to me the difference between "further" and "farther"')
But this is…
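"Connection refused" usually means nothing is listening on the target port, i.e. the OpenLLM server was never started or crashed. A stdlib sketch to verify the port before creating the client:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:          # includes ConnectionRefusedError and timeouts
        return False
```

Start the server first (e.g. `openllm start opt`) and confirm `port_open("localhost", 3000)` before calling `HTTPClient`.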
0 votes, 0 answers
How to use "logit bias" with llama in Meta's scripts?
I've been performing classification using GPT-3/3.5/4 models by restricting outputs using the logit_bias parameter. I am not sure how to do the same in open source models, specifically llama, llama2, and their derivatives.
I have the model weights…
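With raw model access, the OpenAI `logit_bias` behaviour can be emulated by adding a bias vector to the logits before selecting the next token. A framework-free sketch (token ids and values are hypothetical):

```python
def apply_logit_bias(logits, bias):
    """Add `bias` (token_id -> additive value) to a copy of `logits`.
    A large negative value such as -100 effectively bans a token,
    mirroring the OpenAI logit_bias parameter."""
    out = list(logits)
    for token_id, value in bias.items():
        out[token_id] += value
    return out

def pick_next(logits, bias):
    """Greedy next-token choice after applying the bias."""
    biased = apply_logit_bias(logits, bias)
    return max(range(len(biased)), key=biased.__getitem__)
```

In the Meta llama code the same addition would be applied to the logits tensor returned by the forward pass, just before the argmax/sampling step.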

tanny411
0 votes, 0 answers
Error while deploying this repo on AWS (timdettmers/guanaco-33b-merged LLM)
I'm using AWS SageMaker to host this model (timdettmers/guanaco-33b-merged), AWS Lambda to host it in a serverless architecture, and API Gateway, which acts as a trigger to the Lambda function. After hosting the model in…

ashrey kaushik
0 votes, 0 answers
VSCode custom code completion model architectural design
Background:
Say I have a new framework built using the Dart programming language. I want to implement code completion in VS Code that will auto-generate code based on a comment:
Example:
// Generate an component here with 4 inputs, the output…

Yao Jing Quek
0 votes, 0 answers
How can I load a LoRA model from Hugging Face's leaderboard?
From open_llm_leaderboard, there are many interesting 30B LoRA models with extremely good performance.
But how can I load one without adapter_config.json?
I am really sorry that I am new to the field, but if I didn't understand it wrongly, with a…
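One possible workaround (an assumption, not confirmed by the question): PEFT expects an adapter_config.json next to the adapter weights, and if the repo omits it you may be able to reconstruct a minimal one yourself. The field names below follow the PEFT LoRA config; every value is hypothetical and must match how the adapter was actually trained.

```python
import json

# Sketch: minimal adapter_config.json for a PEFT LoRA adapter.
# All values are hypothetical, including the base model path.
adapter_config = {
    "peft_type": "LORA",
    "base_model_name_or_path": "huggyllama/llama-30b",  # assumption
    "task_type": "CAUSAL_LM",
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],
}

with open("adapter_config.json", "w") as f:
    json.dump(adapter_config, f, indent=2)
```

With the file in place next to the adapter weights, `peft.PeftModel.from_pretrained` should be able to locate the configuration, though a mismatched config will load silently and behave badly.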

Johnsmith001
0 votes, 0 answers
Falcon 7B LLM Evaluation using TruLens
The problem I am facing: after defining the prompt template, creating a chain using LangChain, and defining the Hugging Face evaluation module from trulens_eval to check the toxicity of the response, when finally passing the prompt through…
0 votes, 0 answers
Parameter-Efficient BERT
I want to add extra layers into the original BERT (an adapter) where only the adapter will be trained during training while the original BERT network will be frozen. Is it possible to initialise the adapter weights with Kaiming initialisation?
(introduced…
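Kaiming (He) normal initialisation just draws weights from a zero-mean Gaussian with std = sqrt(2 / fan_in), so it can be applied to the adapter matrices alone. A NumPy sketch with hypothetical adapter dimensions; in PyTorch the equivalent is `torch.nn.init.kaiming_normal_` on the adapter layers, while the frozen BERT parameters get `requires_grad_(False)`.

```python
import numpy as np

def kaiming_normal(fan_in: int, fan_out: int, rng=None):
    """He-normal initialisation: N(0, sqrt(2 / fan_in))."""
    rng = rng or np.random.default_rng()
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

# Hypothetical bottleneck adapter: hidden (768) -> bottleneck (64) -> hidden.
adapter_down = kaiming_normal(768, 64)
adapter_up = kaiming_normal(64, 768)
```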

Fahad Alghamdi
0 votes, 0 answers
In NLTK, how to generate a sample of sentences from a PCFG, respecting the probabilities
NLTK has a generate method which enumerates sentences for a given CFG. It also has a PCFG class for probabilistic context-free grammars. Is it possible to generate a sample of sentences with respect to the probabilities defined in the PCFG?
For example, if I…
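Ancestral sampling does this: expand each nonterminal by drawing a production according to its probability. A framework-free sketch with a toy grammar held in plain dicts; the same top-down walk can be driven by `nltk.PCFG` productions, whose probabilities are exposed via `.prob()` (stated as an assumption about the NLTK API worth verifying).

```python
import random

# Hypothetical toy grammar: nonterminal -> [(rhs_tuple, probability), ...]
GRAMMAR = {
    "S":  [(("NP", "VP"), 1.0)],
    "NP": [(("the", "dog"), 0.5), (("a", "cat"), 0.5)],
    "VP": [(("sleeps",), 0.7), (("runs",), 0.3)],
}

def sample_sentence(symbol="S", rng=random):
    """Sample one sentence top-down, choosing productions by probability."""
    if symbol not in GRAMMAR:               # terminal symbol
        return [symbol]
    rules = GRAMMAR[symbol]
    rhs = rng.choices([r for r, _ in rules],
                      weights=[p for _, p in rules])[0]
    return [word for part in rhs for word in sample_sentence(part, rng)]
```

Repeated calls give a sample whose empirical frequencies approach the grammar's probabilities, unlike `nltk.parse.generate`, which enumerates sentences in a fixed order.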

Albert Gevorgyan
0 votes, 0 answers
Why does Llama-2's largest size have 70B parameters rather than 65B?
Meta has just released Llama-2. The largest Llama-2, unlike Llama-1's largest at 65B, has 70B parameters.
The only difference, according to the report, seems to be the 'Grouped-Query Attention' used by Llama-2, which can only decrease the number of…
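The question's premise can be checked with a quick parameter count: grouped-query attention (GQA) shares K/V heads across query heads, shrinking only the K and V projection matrices, so by itself it indeed reduces attention parameters. A sketch with illustrative dimensions (not Llama-2's exact configuration):

```python
def attn_params(d_model: int, n_heads: int, n_kv_heads: int) -> int:
    """Parameter count of one attention block's projections (no biases)."""
    head_dim = d_model // n_heads
    q = d_model * n_heads * head_dim          # query projection
    kv = 2 * d_model * n_kv_heads * head_dim  # key + value projections
    o = n_heads * head_dim * d_model          # output projection
    return q + kv + o

mha = attn_params(8192, 64, 64)   # standard multi-head attention
gqa = attn_params(8192, 64, 8)    # grouped-query: 8 shared K/V heads
```

Since GQA only removes parameters, the extra 5B must come from other configuration changes, which is what the question is probing.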

Arist12
0 votes, 0 answers
How to save only the output of the LLM in LangChain memory
I am going to analyze a bunch of documents to assess the creditworthiness of a company, so due to context length I am going to feed them one by one and create a summary memory.
template = """You are a senior financial analyst a conversation with a…
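LangChain memories record both sides of an exchange by default; one option is to manage the memory yourself and keep only the model outputs. A framework-free stand-in mirroring the `save_context` / `load_memory_variables` shape of LangChain's memory classes (the interface correspondence is an assumption worth checking against the LangChain docs, as is the `output_key` argument that its built-in memories accept for selecting which chain output to store):

```python
class OutputOnlyMemory:
    """Toy memory that retains only the LLM outputs, never the inputs."""

    def __init__(self):
        self.summaries = []

    def save_context(self, inputs: dict, outputs: dict) -> None:
        # Deliberately ignore `inputs`: only the model's output is kept.
        self.summaries.append(outputs["output"])

    def load_memory_variables(self) -> dict:
        return {"history": "\n".join(self.summaries)}
```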

Mohamed Amine
0 votes, 0 answers
fastchat-t5-3b-v1.0 gives truncated/incomplete answers
I have used the following embeddings:
sentence-transformers/all-mpnet-base-v2
hkunlp/instructor-xl
to get embeddings:
def getEmbedding():
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return…

Mukilan