Use this tag for questions about large language models (LLMs): deep-learning models trained to interpret and generate natural-language text.
Questions tagged [large-language-model]
118 questions
1 vote · 1 answer
Trying to install guanaco (pip install guanaco) for a text classification model but getting an error
I'm trying to install the Guanaco language model (https://arxiv.org/abs/2305.14314) using pip install guanaco for a text classification model, but I am getting an error.
Failed to build guanaco
ERROR: Could not build wheels for guanaco, which is required to…

Tamanna · 41 · 5
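If the goal is the Guanaco model from the QLoRA paper rather than a Python package, one alternative is loading the published adapter weights from the Hugging Face Hub with peft. A minimal sketch, where both repository ids are assumptions for illustration and not taken from the question:

# Sketch: load Guanaco as a LoRA adapter on top of a LLaMA base model.
# The repo ids below are assumptions, not from the question.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "huggyllama/llama-7b"        # assumed base checkpoint
adapter_id = "timdettmers/guanaco-7b"  # assumed Guanaco LoRA adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)
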
1 vote · 1 answer
How can I run some inference on the MPT-7B language model?
I wonder how I can run some inference on the MPT-7B language model. The documentation page for the MPT-7B language model on Hugging Face doesn't mention how to run inference (i.e., given a few words, predict the next few words).

Franck Dernoncourt · 77,520 · 72 · 342 · 501
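A minimal inference sketch with transformers, assuming the mosaicml/mpt-7b checkpoint (which ships custom modeling code, hence trust_remote_code=True) and the EleutherAI/gpt-neox-20b tokenizer it was trained with:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # MPT uses custom modeling code hosted on the Hub
)

inputs = tokenizer("Large language models are", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
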
1 vote · 1 answer
How to generate sentiment scores using predefined aspects with deberta-v3-base-absa-v1.1 Huggingface model?
I have a dataframe where the text is in the first column and the predefined aspect is in another column; however, no aspects are defined for some texts, for example row 2.
data = {
    'text': [
        "The camera quality of this phone is amazing.",
…

Dexter1611 · 492 · 1 · 4 · 15
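A minimal sketch of scoring one (text, aspect) pair with a text-classification pipeline, assuming yangheng/deberta-v3-base-absa-v1.1 is the checkpoint the question refers to; rows with no aspect would simply be skipped or given a placeholder aspect:

from transformers import pipeline

# Assumed checkpoint id for the deberta-v3-base-absa-v1.1 model mentioned above.
absa = pipeline("text-classification", model="yangheng/deberta-v3-base-absa-v1.1")

# The model scores a sentence together with one aspect term.
result = absa({"text": "The camera quality of this phone is amazing.",
               "text_pair": "camera"})
print(result)  # e.g. a sentiment label with a confidence score
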
1 vote · 0 answers
mT5 question-answering fine-tuning generates empty sentences during inference
mT5-small question-answering training converges to high training accuracy, high validation accuracy, and near-zero loss; however, when testing the model on the questions it was trained on, I always receive empty answers.
Experiment Language: Arabic
Dataset used:…

Moustafa Banbouk · 73 · 1 · 5
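For context, a minimal generation sketch with mT5; the checkpoint name, prompt format, and decoding settings are assumptions, and forcing a minimum length plus beam search is just one way to check whether the model can emit non-empty answers at all:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

prompt = "question: ... context: ..."  # placeholder; the dataset fields are not shown in the question
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, num_beams=4, min_new_tokens=5, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
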
1 vote · 0 answers
Getting CUDA out of memory when calling save_pretrained in a script that LoRA-trains a large language model using Hugging Face
I am trying to train a LLaMA LLM ("eachadea/vicuna-13b-1.1") using LoRA on a LambdaLabs A100 40 GB.
Everything seems to be working fine, including the training; however, the script fails on the last line:…

Ray Hulha · 10,701 · 5 · 53 · 53
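One way to keep the checkpointing step small, sketched under the assumption that model is a PEFT/LoRA-wrapped model: serialize only the adapter weights rather than the full 13B state dict, and release cached CUDA blocks first.

import torch
from peft import get_peft_model_state_dict

torch.cuda.empty_cache()                           # drop cached allocator blocks before saving
adapter_state = get_peft_model_state_dict(model)   # only the LoRA parameters
torch.save(adapter_state, "adapter_model.bin")
# or equivalently, for a PeftModel: model.save_pretrained("lora-out")
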
1 vote · 1 answer
Is it possible to build a text classifier using an existing LLM like ChatGPT?
Before LLMs, when I wanted to build a text classifier (e.g., a sentiment analysis model that, given an input text, returns "positive", "neutral", or "negative"), I had to gather tons of data, choose a model architecture, and spend resources…

Eumaa · 971 · 2 · 15 · 38
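A minimal zero-shot sketch with the OpenAI Python client; the model name and prompt wording are assumptions, and an API key must be available in the environment:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[
            {"role": "system",
             "content": "Answer with exactly one word: positive, neutral, or negative."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content.strip().lower()

print(classify("The battery life is disappointing."))
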
1 vote · 1 answer
Issue with authorization with Ably when trying to run Pinecone Demo chat app
I have been trying to get the Pinecone demo chat app from their website up and running (link to it). I have put all the keys in the .env file and the UI seems to pop up correctly. However, it shows this error:
[TypeError: Cannot read…

AaravS · 23 · 3
1 vote · 1 answer
llama_index with a custom LLM answering out of context
I am using llama_index with a custom LLM; the LLM I have used is the Open Assistant Pythia model.
My code:
import os
from llama_index import (
    GPTKeywordTableIndex,
    SimpleDirectoryReader,
    LLMPredictor,
    ServiceContext,
    PromptHelper,
)
from…

Ankit Bansal · 2,162 · 8 · 42 · 79
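The llama_index API has changed considerably across versions; a minimal sketch following the 0.6-era imports shown above, where my_llm and the "data" folder stand in for the custom Open Assistant Pythia wrapper and document directory from the question:

from llama_index import (
    GPTKeywordTableIndex, SimpleDirectoryReader, LLMPredictor, ServiceContext,
)

documents = SimpleDirectoryReader("data").load_data()   # "data" is a placeholder folder
llm_predictor = LLMPredictor(llm=my_llm)                # my_llm: the custom LLM wrapper
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
index = GPTKeywordTableIndex.from_documents(documents, service_context=service_context)

query_engine = index.as_query_engine()
print(query_engine.query("Ask something answerable from the documents"))
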
1 vote · 1 answer
Problem with custom metric for custom T5 model
I have created a custom dataset and trained a custom T5ForConditionalGeneration model on it; the model predicts solutions to quadratic equations like this:
Input: "4*x^2 + 4*x + 1"
Output: D = 4 ^ 2 - 4 * 4 * 1 4 * 1 4 * 1 4 * 1 4 * 1 4
I need to get…

ALiCe P. · 231 · 1 · 10
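A sketch of an exact-match style metric for a Seq2SeqTrainer with predict_with_generate=True; masking the -100 label positions before decoding is the usual step, and the surrounding setup (a tokenizer already in scope) is assumed:

import numpy as np

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    # -100 marks ignored label positions; restore pad ids before decoding
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    pred_str = tokenizer.batch_decode(preds, skip_special_tokens=True)
    label_str = tokenizer.batch_decode(labels, skip_special_tokens=True)
    exact = np.mean([p.strip() == l.strip() for p, l in zip(pred_str, label_str)])
    return {"exact_match": float(exact)}
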
1 vote · 1 answer
How to add new tokens to an existing Huggingface tokenizer?
How to add new tokens to an existing Huggingface AutoTokenizer?
Canonically, there's this tutorial from Hugging Face, https://huggingface.co/learn/nlp-course/chapter6/2, but it ends on the note of "quirks when using existing tokenizers". And then it…

alvas · 115,346 · 109 · 446 · 738
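The usual two-step pattern, sketched here with GPT-2 as a stand-in checkpoint: add the tokens to the tokenizer, then resize the model's embedding matrix to match the new vocabulary size.

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

num_added = tokenizer.add_tokens(["<NEW_TOKEN_1>", "<NEW_TOKEN_2>"])  # placeholder tokens
model.resize_token_embeddings(len(tokenizer))  # new embedding rows are randomly initialized
print(f"added {num_added} tokens, vocab size is now {len(tokenizer)}")
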
1 vote · 1 answer
Loading Multiple LoRA bins
I wish to fine-tune a base LLM using LoRA with multiple datasets that are structured differently (different columns and data types). I have two questions:
Can I fine-tune the model with the first dataset, then add/fine-tune the generated LoRA…

karim1104 · 13 · 4
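A sketch of attaching more than one LoRA adapter to the same base model with peft; the model id, paths, and adapter names are placeholders:

from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("base-model-id")      # placeholder id
model = PeftModel.from_pretrained(base, "lora-a", adapter_name="task_a")
model.load_adapter("lora-b", adapter_name="task_b")  # attach a second adapter, kept separate
model.set_adapter("task_a")                          # choose which adapter is active
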
1 vote · 1 answer
BioGPT causal language model with unexpected error
I am trying to use a Causal Language Model from BioGPT. However, I got a strange error.
Here are my steps:
First, I installed transformers and sacremoses:
!pip install transformers sacremoses -q
Then I executed the following code:
input_sequence =…

tobias · 501 · 1 · 6 · 15
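For reference, a minimal generation sketch, assuming the microsoft/biogpt checkpoint (its tokenizer is why sacremoses is installed):

from transformers import pipeline

generator = pipeline("text-generation", model="microsoft/biogpt")
print(generator("COVID-19 is", max_new_tokens=20, num_return_sequences=1))
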
1 vote · 1 answer
Databricks Dolly LLM: empty result when using LangChain with context
I'm following a tutorial on Hugging Face (let's say this one, though I get the same result with other Dolly models). I am trying to run predictions with context but receive an empty string as output. I tried different models and text…

Nik · 161 · 1 · 13
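A sketch of wiring a Dolly pipeline into LangChain; the databricks/dolly-v2-3b checkpoint is an assumption, and return_full_text=True is one commonly suggested setting, since LangChain otherwise strips the prompt and can leave an empty completion:

import torch
from transformers import pipeline
from langchain.llms import HuggingFacePipeline

generate_text = pipeline(
    model="databricks/dolly-v2-3b",   # assumed checkpoint
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,           # Dolly's instruct pipeline is custom Hub code
    device_map="auto",
    return_full_text=True,            # keep the prompt in the output for LangChain
)
llm = HuggingFacePipeline(pipeline=generate_text)
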
1 vote · 1 answer
How to use pipeline for multiple target language translations with M2M model in Huggingface?
The M2M model is trained on ~100 languages and is able to translate between different languages, e.g.
from transformers import pipeline
m2m100 = pipeline('translation', 'facebook/m2m100_418M', src_lang='en', tgt_lang="de")
m2m100(["hello world", "foo…

alvas · 115,346 · 109 · 446 · 738
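A sketch of reusing one pipeline for several target languages by passing src_lang/tgt_lang at call time, assuming the translation pipeline forwards those arguments to the M2M100 tokenizer:

from transformers import pipeline

m2m100 = pipeline("translation", model="facebook/m2m100_418M")

# One pipeline, a different target language per call.
print(m2m100("hello world", src_lang="en", tgt_lang="de"))
print(m2m100("hello world", src_lang="en", tgt_lang="fr"))
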
1 vote · 1 answer
Can mT5 model on Huggingface be used for machine translation?
The mT5 model is pretrained on the mC4 corpus, covering 101 languages:
Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish,…

alvas · 115,346 · 109 · 446 · 738