Questions tagged [large-language-model]

Use this tag for questions about large language models (LLMs): deep-learning models trained to interpret and generate natural-language text.

118 questions
0 votes, 0 answers

Adding Spaces to Language Model Output for User-Friendly Display

Large language models generate a list of tokens, but to show these tokens to a user we need to add spaces between them. How do we know where to put these spaces so the text displays properly? For example, this text was tokenized using OpenAI…
hakim47 • 33 • 3
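
A hedged sketch of the usual answer, using OpenAI's tiktoken library (the sample string is illustrative): BPE tokenizers encode whitespace inside the tokens themselves, so decoding the token IDs reconstructs the original spacing; no extra spaces need to be inserted.

    import tiktoken

    # Spacing is part of each token: " world" and "world" are different tokens,
    # so decode() restores the original text exactly.
    enc = tiktoken.get_encoding("cl100k_base")
    ids = enc.encode("Hello world, this is a test.")
    print(ids)              # a list of integer token IDs
    print(enc.decode(ids))  # -> "Hello world, this is a test."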
0 votes, 0 answers

Facing OSError when trying to download a model in OpenLLM

I am trying to run OpenLLM, but I am facing this error. I ran three commands: !pip3 install openllm, !openllm -h, !openllm start opt, but the third command's output raises this error: OSError: model.safetensors or model.safetensors.index.json and…
0 votes, 0 answers

How to keep an LLM's response in a specific language?

I have been working on a RAG architecture with the Falcon-40b model, and the implementation of the chatbot must be in Spanish. I understand that Falcon-40b is multilingual. So far the RAG has returned good responses, but sometimes it outputs the response…
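
A minimal sketch of one common fix, using a LangChain prompt template (the wording is illustrative): state the target language explicitly in the prompt instead of relying on the model to infer it from the question.

    from langchain.prompts import PromptTemplate

    # Pin the response language in the RAG prompt itself.
    template = (
        "Answer the question using only the context below. "
        "Respond in Spanish, regardless of the language of the context.\n\n"
        "Context: {context}\n\nQuestion: {question}\n\nAnswer:"
    )
    prompt = PromptTemplate(input_variables=["context", "question"], template=template)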
0 votes, 0 answers

LangChain and a vector DB always return different answers

I am playing with LangChain, OpenAI, and Pinecone (a vector DB). I have generated a random list of 16 toys in total. Each toy is in a row with a small description. LEGO sets: LEGO offers a wide range of building sets, including themed sets based on…
Khan • 1,418 • 1 • 25 • 49
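
A minimal sketch, assuming the chain's LLM is an OpenAI chat model: answers that differ between runs are usually a sampling effect, and temperature=0 makes generation (near-)deterministic for the same retrieved context.

    from langchain.chat_models import ChatOpenAI

    # temperature=0 disables sampling randomness, so repeated queries over the
    # same retrieved documents should return (almost) the same answer.
    llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)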
0 votes, 1 answer

Getting a connection refused error using the openllm Python library

I am trying to utilise this GitHub repo, particularly the Python code below:

    import openllm
    client = openllm.client.HTTPClient('http://localhost:3000')
    client.query('Explain to me the difference between "further" and "farther"')

But this is…
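A connection refused error at this address usually means no server process is listening on port 3000 yet. A minimal sketch of the intended order of operations, assuming an OpenLLM server is started separately first (e.g. openllm start opt, as in the question above):

    import openllm

    # The HTTPClient only talks to an already-running OpenLLM server; start one
    # in another process and wait for it to be ready before connecting.
    client = openllm.client.HTTPClient("http://localhost:3000")
    print(client.query('Explain to me the difference between "further" and "farther"'))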
0 votes, 0 answers

How to use "logit bias" with LLaMA in Meta's scripts?

I've been performing classification using GPT-3/3.5/4 models by restricting outputs using the logit_bias parameter. I am not sure how to do the same in open source models, specifically llama, llama2, and their derivatives. I have the model weights…
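
A minimal sketch of an equivalent in Hugging Face transformers rather than Meta's reference scripts (the checkpoint and biased tokens are placeholders): a custom LogitsProcessor that adds a bias to chosen token IDs before sampling reproduces the effect of logit_bias.

    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              LogitsProcessor, LogitsProcessorList)

    class LogitBias(LogitsProcessor):
        """Add a fixed bias to chosen token IDs, like OpenAI's logit_bias."""
        def __init__(self, bias):
            self.bias = bias  # {token_id: bias_value}

        def __call__(self, input_ids, scores):
            for token_id, value in self.bias.items():
                scores[:, token_id] += value
            return scores

    name = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    # Strongly favour the tokens for " positive" and " negative" (illustrative).
    bias = {tok.encode(" positive", add_special_tokens=False)[0]: 100.0,
            tok.encode(" negative", add_special_tokens=False)[0]: 100.0}
    inputs = tok("The sentiment of 'great movie' is", return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=1,
                         logits_processor=LogitsProcessorList([LogitBias(bias)]))
    print(tok.decode(out[0], skip_special_tokens=True))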
0 votes, 0 answers

Error while deploying this repo in AWS (timdettmers/guanaco-33b-merged LLM)

I'm using AWS SageMaker to host this model (timdettmers/guanaco-33b-merged), and then AWS Lambda to serve it in a serverless architecture, with API Gateway acting as a trigger to the Lambda function. After hosting the model in…
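
For reference, a minimal sketch of the common Hugging Face-on-SageMaker deployment path (the library versions, instance type, and role lookup are illustrative assumptions): the model is served from a SageMaker endpoint, which the Lambda function then invokes.

    import sagemaker
    from sagemaker.huggingface import HuggingFaceModel

    # Inside a SageMaker notebook this resolves the execution role; elsewhere,
    # pass the role ARN explicitly.
    role = sagemaker.get_execution_role()
    model = HuggingFaceModel(
        env={"HF_MODEL_ID": "timdettmers/guanaco-33b-merged",
             "HF_TASK": "text-generation"},
        role=role,
        transformers_version="4.28",
        pytorch_version="2.0",
        py_version="py310",
    )
    predictor = model.deploy(initial_instance_count=1,
                             instance_type="ml.g5.12xlarge")
    print(predictor.predict({"inputs": "Hello"}))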
0 votes, 0 answers

Architectural design for a custom VSCode code-completion model

Background: let's say I have a new framework that is built using the Dart programming language. I want to implement code completion in VSCode that will auto-generate code based on a comment. Example: // Generate a component here with 4 inputs, the output…
0 votes, 0 answers

How can I load a LoRA model from Hugging Face's leaderboard?

From the open_llm_leaderboard, there are many interesting 30b LoRA models with extremely good performance. But how can I load one without an adapter_config.json? I am sorry that I am new to the field, but if I haven't misunderstood, with a…
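
A minimal sketch with the peft library (both model IDs are placeholders): a LoRA from the leaderboard is normally loaded as an adapter on top of its base model, and adapter_config.json is what tells peft which base model and target modules the adapter was trained against, so it cannot simply be skipped.

    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # Load the base model first, then attach the LoRA adapter on top of it.
    base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-30b")
    model = PeftModel.from_pretrained(base, "some-user/some-30b-lora")
    model = model.merge_and_unload()  # optional: fold the LoRA into the base weights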
0 votes, 0 answers

Falcon 7B LLM Evaluation using TruLens

The problem I am facing: after defining the prompt template, creating a chain using LangChain, and defining the Hugging Face evaluation module from trulens_eval to check the toxicity of the response, when finally passing the prompt through…
0 votes, 0 answers

Parameter-Efficient BERT

I want to add extra layers (an adapter) into the original BERT, where only the adapter will be trained while the original BERT network is frozen. Is it possible to initialise the adapter weights with Kaiming initialisation? (introduced…
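
A minimal sketch of a bottleneck adapter with Kaiming (He) initialisation, assuming plain PyTorch around a frozen BERT (the layer sizes are illustrative); nothing prevents initialising the adapter weights this way.

    import torch.nn as nn

    class Adapter(nn.Module):
        """Bottleneck adapter inserted into a frozen BERT layer."""
        def __init__(self, hidden_size=768, bottleneck=64):
            super().__init__()
            self.down = nn.Linear(hidden_size, bottleneck)
            self.up = nn.Linear(bottleneck, hidden_size)
            self.act = nn.ReLU()
            # Kaiming (He) initialisation for the adapter weights.
            nn.init.kaiming_normal_(self.down.weight, nonlinearity="relu")
            nn.init.kaiming_normal_(self.up.weight, nonlinearity="relu")
            nn.init.zeros_(self.down.bias)
            nn.init.zeros_(self.up.bias)

        def forward(self, x):
            return x + self.up(self.act(self.down(x)))  # residual connection

    # Freeze the original BERT and train only the adapter parameters:
    # for p in bert.parameters():
    #     p.requires_grad = False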
0 votes, 0 answers

In NLTK, how to generate a sample of sentences from a PCFG, respecting the probabilities

NLTK has a generate method which enumerates sentences for a given CFG. It also has a PCFG class for probabilistic context-free grammars. Is it possible to generate a sample of sentences with respect to the probabilities defined in the PCFG? For example, if I…
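
A minimal sketch (the grammar is illustrative): NLTK's generate enumerates rather than samples, but a PCFG can be sampled directly by recursively expanding nonterminals, choosing each production with the probability given by prod.prob().

    import random
    from nltk import PCFG
    from nltk.grammar import Nonterminal

    grammar = PCFG.fromstring("""
        S -> NP VP [1.0]
        NP -> 'cats' [0.7] | 'dogs' [0.3]
        VP -> 'sleep' [0.6] | 'run' [0.4]
    """)

    def sample(grammar, symbol=None):
        symbol = symbol if symbol is not None else grammar.start()
        if not isinstance(symbol, Nonterminal):
            return [symbol]  # terminal: emit the word
        productions = grammar.productions(lhs=symbol)
        weights = [p.prob() for p in productions]
        chosen = random.choices(productions, weights=weights)[0]
        return [w for sym in chosen.rhs() for w in sample(grammar, sym)]

    print(" ".join(sample(grammar)))  # e.g. "cats sleep", ~42% of the time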
0 votes, 0 answers

Why does LLama-2 have a largest size of 70b rather than 65b?

Meta has just released LLama-2. The largest LLama-2, unlike LLama-1's largest at 65b, has 70b parameters. The only difference according to the report seems to be the 'Grouped-Query Attention' used by LLama-2, which can only decrease the number of…
Arist12 • 172 • 1 • 7
0 votes, 0 answers

How to save only the output of the LLM in LangChain memory

I am going to analyze a bunch of documents to assess the creditworthiness of a company, so due to context length I am going to feed them one by one and create a summary memory. template = """You are a senior financial analyst in a conversation with a…
Mohamed Amine • 340 • 1 • 4 • 16
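
A minimal sketch, assuming a LangChain chain that returns more than one output key (the key names here are illustrative): output_key tells the memory to store only the LLM's answer, and input_key controls which input is recorded.

    from langchain.llms import OpenAI
    from langchain.memory import ConversationSummaryMemory

    # With multi-output chains, output_key picks which value the memory keeps.
    memory = ConversationSummaryMemory(
        llm=OpenAI(temperature=0),
        memory_key="chat_history",
        input_key="question",  # which chain input to record
        output_key="answer",   # store only the LLM's answer
    )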
0 votes, 0 answers

fastchat-t5-3b-v1.0 gives truncated/incomplete answers

I have used the following embeddings: sentence-transformers/all-mpnet-base-v2 and hkunlp/instructor-xl. To get an embedding:

    def getEmbedding():
        device = "cuda" if torch.cuda.is_available() else "cpu"
        return…
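
A minimal sketch of the usual first check for cut-off answers (the generation budget is illustrative): a local seq2seq model stops when it exhausts its token limit, so raising max_new_tokens on the generation pipeline is the first thing to try.

    from transformers import pipeline

    # Truncated answers often just mean the generation budget was exhausted.
    generator = pipeline("text2text-generation",
                         model="lmsys/fastchat-t5-3b-v1.0",
                         max_new_tokens=512)
    print(generator("Summarise the document: ...")[0]["generated_text"])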