Use this tag for questions about large language models (LLMs): deep-learning models trained to interpret and generate natural-language text.
Questions tagged [large-language-model]
118 questions
0
votes
0 answers
DeepSpeed on dolly-7B not using all GPUs during inference
I followed the official DeepSpeed tutorial for inference.
However, I keep getting a CUDA OOM error. When I check GPU usage, it seems that instead of using all 4 available 24 GiB GPUs, it is using a single GPU.
Code to reproduce:
generator =…

Shirish Bajpai
0
votes
0 answers
Loading Megatron NLP Pretrained Model and Training it with my own data. Errors
I am getting errors, the most recent being: ImportError: cannot import name 'LightningDistributedModule' from 'pytorch_lightning.overrides'.
I'm trying to load a pre-trained model and then train it further on other files. I have the links to these…

Hal9000AIML
0
votes
0 answers
How to prepare a dataset with multiple answers for a single question to train the GPT3/davinci model
I am trying to fine-tune the GPT model, and for that I have 3 columns: context, question, and answer. However, I have multiple answers to a single question, so I have repeated the question text for each answer. What is the best way to prepare an optimized…
0
votes
2 answers
GPT3: from next-word prediction to sentiment analysis, dialogs, summaries, translation...?
How does GPT3 or another model go from next-word prediction to sentiment analysis, dialogs, summaries, translation...?
What are the ideas and algorithms?
How does it work?
For example, generating a paragraph means generating the next word, then the next…

sten
-1
votes
1 answer
Can BERT or an LLM be used for sentence-to-word recommendation?
I'm a junior data analyst.
I'm looking for a method for sentence -> word recommendation.
For example, if I input 'the little mermaid' and the book's introduction (a sentence), the model could output 'swim suit' or 'fish doll'.
My knowledge of NLP is beginner…

yoojinyoon
-1
votes
0 answers
An LLM that embeds my full project and continuously helps me develop
TL;DR
What about embedding your whole repository and asking GPT to modify a part of it, reflecting the changes each time?
Is there already a repo for doing this?
I searched auto-gpt, but it just automates the whole code generation process. It just…

user19023975
-1
votes
1 answer
Large language model runs out of memory no matter which hyperparameters I change - GeForce RTX 3060 Ti
I have been trying to fix this for a few days. The problem is that it runs out of memory because my training data is quite large. I have a system implemented to take pieces of the code by manually entering the location of the training data chunks.…

Auto
-1
votes
1 answer
LLM answering out-of-context questions (trained on user data)
I have trained an LLM on my PDF file and am now asking questions related to it. If a question is outside that context, I want the answer to be "I don't know" or "out of context".
Right now it answers even out-of-context questions.
I have used…

Mukilan
-2
votes
1 answer
Building a Closed-Domain Legal Language Model with LLaMA 2 7B: Pretraining vs. Finetuning, Optimization Strategies, and Feasibility
I'm attempting to build a closed-domain language model specifically tailored to legal services, essentially emulating a standalone lawyer. My approach involves pretraining the LLaMA 2 7B model, focusing only on the legal domain, and then fine-tuning…
-2
votes
2 answers
AI: Create a domain-specific LLM
Apologies in advance for a very broad question. I have a friend who works as a grant writer and she has a corpus of successful grant proposals. Is there a way to easily create a domain-specific LLM that trains off of these proposals for the creation…

lispquestions
-3
votes
0 answers
Can I generate reinforced questions based on the user's response?
Input: "I have a cough and a runny nose"
bot: "How long have you had the cough? (days/weeks/months)"
Input: "For the past week."
bot: "Is the cough dry or productive? (dry/productive)"
Input: "The cough is productive"
bot: "Does the cough sound…
-3
votes
0 answers
Can QLoRA adapters on StableBeluga-7B learn a personality?
I'm building a repository of QLoRA adapters that change the model's personality. The end vision is a hub of ready-to-go personality adapters.
I'm hitting a snag when training the QLoRA adapters for Paul Graham's personality on top of a 4-bit quantized…
-3
votes
0 answers
Loading a local LLM with HuggingFace and running SQL queries
I am trying to create an LLM that generates SQL queries. Does anyone know how to use HuggingFacePipeline.from_pretrained to load a locally stored LLM? from_pretrained is not working with HuggingFace; the method does not seem to exist.
--------- Adding…

RaptorX