Questions tagged [large-language-model]

Use this tag for questions about large language models (LLMs): deep-learning models trained to interpret and generate natural-language text.

118 questions
0 votes · 0 answers

Deepspeed on dolly-7B not using all GPUs during inference

I followed the official tutorial for inference with DeepSpeed. However, I keep getting a CUDA OOM error. When I check GPU usage, it seems that instead of using the 4 available 24 GiB GPUs, it is using a single GPU (a sketch follows this entry). Code to reproduce: generator =…
Shirish Bajpai · 608 · 1 · 5 · 16
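A common cause of this symptom is calling DeepSpeed-Inference without telling it to shard the model. A minimal sketch, assuming four visible GPUs and the databricks/dolly-v2-7b checkpoint (both assumptions, not taken from the question):

```python
# Minimal sketch: shard a causal LM across 4 GPUs with DeepSpeed-Inference.
# Assumptions: 4 visible 24 GiB GPUs, databricks/dolly-v2-7b, fp16 weights.
import torch
import deepspeed
from transformers import pipeline

generator = pipeline("text-generation",
                     model="databricks/dolly-v2-7b",
                     torch_dtype=torch.float16)

# mp_size splits the weights across GPUs (tensor parallelism); without it,
# DeepSpeed keeps the whole model on one device and OOMs on large checkpoints.
generator.model = deepspeed.init_inference(
    generator.model,
    mp_size=4,                       # number of GPUs to shard across
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)

print(generator("Explain tensor parallelism in one sentence.")[0]["generated_text"])
```

The script also has to be started with the DeepSpeed launcher (for example `deepspeed --num_gpus 4 script.py`) so that one process per GPU is spawned; running it with plain `python` leaves everything on a single device.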
0 votes · 0 answers

Loading a Megatron NLP pretrained model and training it with my own data: errors

I am getting errors, the most recent being: ImportError: cannot import name 'LightningDistributedModule' from 'pytorch_lightning.overrides'. I'm trying to load a pre-trained model and then continue training it with other files. I have the links to these…
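That particular ImportError usually points at a version mismatch rather than at the checkpoint: LightningDistributedModule belonged to the pytorch-lightning 1.x API, and the NeMo/Megatron code importing it assumes that API. A small check, written under that assumption:

```python
# Quick compatibility check (assumption: the NeMo/Megatron release in use was
# written against the pytorch-lightning 1.x API, where
# pytorch_lightning.overrides.LightningDistributedModule still existed).
import pytorch_lightning as pl

print("pytorch-lightning version:", pl.__version__)

try:
    from pytorch_lightning.overrides import LightningDistributedModule  # noqa: F401
    print("LightningDistributedModule is importable; the error lies elsewhere.")
except ImportError:
    print("Not importable; installing the lightning version pinned by your "
          "NeMo release (e.g. pip install 'pytorch-lightning<2.0') is a likely fix.")
```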
0 votes · 0 answers

How to prepare a dataset with multiple answers per question to fine-tune the GPT-3/davinci model

I am trying to fine-tune the GPT model, and for that I have 3 columns: context, question, and answer. But I have multiple answers for some questions, and I have repeated the question text for each answer. What is the best way to prepare an optimized…
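One common way to handle this for the legacy davinci fine-tuning format is to emit one prompt/completion pair per (question, answer) row, so a question with several answers simply appears on several JSONL lines. A sketch; the column values and separators are illustrative assumptions:

```python
# Minimal sketch: turn (context, question, answer) rows into JSONL
# prompt/completion pairs for legacy GPT-3 (davinci) fine-tuning.
# One line per answer, so multi-answer questions are repeated on purpose.
import json

rows = [  # stand-in for your dataframe / CSV rows
    {"context": "Refund policy ...", "question": "How long do refunds take?",
     "answer": "Usually 5-7 business days."},
    {"context": "Refund policy ...", "question": "How long do refunds take?",
     "answer": "Up to two weeks for international cards."},
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for r in rows:
        record = {
            "prompt": f"{r['context']}\n\nQ: {r['question']}\nA:",
            "completion": " " + r["answer"].strip() + "\n",  # leading space + stop token
        }
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```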
0 votes · 2 answers

GPT-3: from next-word prediction to sentiment analysis, dialogue, summarization, translation ...?

How does GPT-3 or another model go from next-word prediction to sentiment analysis, dialogue, summarization, translation ...? What is the idea and what are the algorithms? How does it work? (A sketch follows this entry.) For example, generating a paragraph means generating the next word, then the next…
sten · 7,028 · 9 · 41 · 63
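The short version is that each of those tasks is rewritten as a text prefix whose most likely continuation is the answer, so the single next-token objective covers all of them. A toy sketch; gpt2 is used only so it runs locally, and the prompts are illustrative:

```python
# Sketch: sentiment analysis, summarization and translation all phrased as
# next-token prediction over a prompt. gpt2 is tiny and unreliable; larger
# models follow these instructions far better, but the mechanism is the same.
from transformers import pipeline

generate = pipeline("text-generation", model="gpt2")

prompts = {
    "sentiment":   "Review: 'The battery died after a day.'\nSentiment (positive/negative):",
    "summary":     "Article: 'The council voted 7-2 to fund the new bridge...'\nOne-sentence summary:",
    "translation": "English: 'Where is the train station?'\nFrench:",
}

for task, prompt in prompts.items():
    out = generate(prompt, max_new_tokens=20, do_sample=False)[0]["generated_text"]
    print(task, "->", out[len(prompt):].strip())  # the continuation is the "answer"
```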
-1 votes · 1 answer

Can BERT or an LLM be used for sentence-to-word recommendation?

I'm a junior data analyst looking for a method for sentence -> word recommendation. For example, if I input 'the little mermaid' and the book's introduction (a sentence), the model should put out 'swim suit' or 'fish doll'. My knowledge of NLP is beginner…
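One beginner-friendly framing, sketched under the assumption that there is a fixed candidate list of recommendable words/products: embed the input sentence and every candidate with a sentence-embedding model and return the nearest candidates. The model name and candidate list below are illustrative:

```python
# Sketch: sentence -> word recommendation as nearest-neighbour search over
# sentence embeddings. Candidate list and model name are assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

candidates = ["swim suit", "fish doll", "toy car", "pirate hat", "coloring book"]
query = "The Little Mermaid: a young mermaid dreams of life on land..."

query_emb = model.encode(query, convert_to_tensor=True)
cand_embs = model.encode(candidates, convert_to_tensor=True)

scores = util.cos_sim(query_emb, cand_embs)[0]        # cosine similarity per candidate
top = scores.argsort(descending=True)[:3]
print([(candidates[int(i)], float(scores[i])) for i in top])
```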
-1 votes · 0 answers

An LLM that embeds my full project and continuously helps me develop

TL;DR: What about embedding your whole repository and asking GPT to modify a part of it, reflecting the changes each time? Is there already a repo for doing this? I searched Auto-GPT, but it just automates the whole code-generation process. It just…
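The usual pattern behind tools in this space is retrieval-augmented prompting rather than fitting the whole repo into one context window: chunk the files, embed the chunks, and on each request retrieve only the relevant ones for the model to edit. A rough sketch of the retrieval half; the repo path, chunk size, and model name are assumptions:

```python
# Sketch: index a repository's files as embedded chunks and retrieve the
# pieces relevant to a change request. Line-based chunking is deliberately naive.
from pathlib import Path
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

chunks, sources = [], []
for path in Path("my_repo").rglob("*.py"):           # assumed repo location
    lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()
    for i in range(0, len(lines), 40):                # 40-line chunks
        chunks.append("\n".join(lines[i:i + 40]))
        sources.append(f"{path}:{i + 1}")

chunk_embs = model.encode(chunks, convert_to_tensor=True)

request = "add retry logic to the HTTP client"
scores = util.cos_sim(model.encode(request, convert_to_tensor=True), chunk_embs)[0]
for idx in scores.argsort(descending=True)[:3]:
    print(sources[int(idx)])   # these chunks would be pasted into the LLM prompt to edit
```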
-1 votes · 1 answer

Large language model runs out of memory no matter which hyperparameters I change - GeForce RTX 3060 Ti

I have been trying to fix this for a few days. The problem is that training runs out of memory because my training data is quite large (a sketch of the usual memory-saving settings follows this entry). I have a system implemented to load pieces of the training data by manually entering the location of the data chunks.…
Auto · 59 · 1 · 1 · 6
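Dataset size alone rarely causes CUDA OOM, since batches are loaded piecewise; on an 8 GB card the levers that matter are micro-batch size, precision, gradient checkpointing, and parameter-efficient fine-tuning. A hedged sketch of the usual knobs with the transformers Trainer; the values are illustrative starting points:

```python
# Sketch: common memory-reduction settings for fine-tuning on an 8 GB GPU.
# Values are starting points, not tuned recommendations.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,     # smallest possible micro-batch
    gradient_accumulation_steps=16,    # keeps the effective batch size at 16
    fp16=True,                         # half-precision activations/gradients
    gradient_checkpointing=True,       # trade compute for activation memory
    optim="adafactor",                 # lighter optimizer state than AdamW
    logging_steps=50,
)
# If this still OOMs, the model itself is too large to fine-tune fully on 8 GB;
# LoRA/QLoRA-style parameter-efficient fine-tuning is the usual next step.
```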
-1 votes · 1 answer

LLMs answering out of context (trained on user data)

I have trained an LLM on my PDF file, and now I am asking questions related to it. If a question is asked that is out of context, I want the answer to be "I don't know" or "out of context". Right now it answers even when the question is out of context. I have used…
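In retrieval-based setups this is usually handled in two places: the prompt explicitly tells the model to refuse when the context lacks the answer, and a similarity threshold short-circuits questions whose retrieved chunks are all weak matches. A sketch of both ideas; the threshold value, prompt wording, and the retriever/llm placeholders are assumptions:

```python
# Sketch: two guards against out-of-context answers in a retrieval-QA setup.
# 1) a prompt that instructs the model to refuse, 2) a retrieval-score threshold.

PROMPT_TEMPLATE = (
    "Answer the question using ONLY the context below. "
    "If the context does not contain the answer, reply exactly: I don't know.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)

SIMILARITY_THRESHOLD = 0.35   # illustrative; tune on held-out questions


def answer(question, retriever, llm):
    # retriever and llm are placeholders for your vector store and model calls
    docs = retriever(question)                         # [(text, score), ...]
    best = max((score for _, score in docs), default=0.0)
    if best < SIMILARITY_THRESHOLD:
        return "out of context"                        # model never even asked
    context = "\n\n".join(text for text, _ in docs)
    return llm(PROMPT_TEMPLATE.format(context=context, question=question))
```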
-2 votes · 1 answer

Building a Closed-Domain Legal Language Model with LLaMA 2 7B: Pretraining vs. Finetuning, Optimization Strategies, and Feasibility

I'm attempting to build a closed-domain language model specifically tailored to legal services, essentially emulating a standalone lawyer. My approach involves pretraining the LLaMA 2 7B model, focusing only on the legal domain, and then fine-tuning…
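On feasibility, the common middle ground on a small budget is to skip full pretraining and instead run parameter-efficient fine-tuning (LoRA) on legal text, optionally after continued pretraining on raw legal corpora. A hedged sketch of attaching LoRA adapters to LLaMA 2 7B with peft; the target modules and ranks are typical choices, not requirements:

```python
# Sketch: LoRA adapters on LLaMA 2 7B for a legal-domain fine-tune.
# Hyperparameters are common starting points, not prescriptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"   # gated checkpoint; requires accepted license
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16,
                                             device_map="auto")

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections in LLaMA
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()          # typically <1% of the 7B weights
```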
-2 votes · 2 answers

AI: Create a domain-specific LLM

Apologies in advance for a very broad question. I have a friend who works as a grant writer and she has a corpus of successful grant proposals. Is there a way to easily create a domain-specific LLM that trains off of these proposals for the creation…
lispquestions · 431 · 3 · 12
-3 votes · 0 answers

Can I generate reinforced questions based on user responses?

Input: "I am having a cough and a runny nose." Bot: "How long have you had the cough? (days/weeks/months)" Input: "For the last one week." Bot: "Is the cough dry or productive? (dry/productive)" Input: "The cough is productive." Bot: "Does the cough sound…
-3 votes · 0 answers

Can QLoRA adapters on StableBeluga-7B learn a personality?

I'm building a repository of QLoRA adapters that change the model's personality. The end vision is a hub of ready-to-go personality adapters. I'm hitting a snag when training the QLoRA adapters for Paul Graham's personality on top of a 4-bit quantized…
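In case the snag is in the 4-bit setup itself, a typical QLoRA configuration looks like the sketch below; the quantization settings follow the standard QLoRA recipe, while the personality dataset and training loop are out of scope here:

```python
# Sketch: standard QLoRA setup - 4-bit NF4 quantized base model plus LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/StableBeluga-7B", quantization_config=bnb, device_map="auto"
)
model = prepare_model_for_kbit_training(model)   # gradient checkpointing, cast norms

lora = LoraConfig(r=64, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
```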
-3 votes · 0 answers

Loading a local LLM with HuggingFace and running SQL queries

I am trying to create a SQL-query LLM. Does anyone know how to use HuggingFacePipeline.from_pretrained to load a locally stored LLM model? from_pretrained is not working with HuggingFace, as in the method does not exist. --------- Adding…
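Assuming HuggingFacePipeline here refers to LangChain's wrapper, it exposes from_model_id rather than from_pretrained, and it also accepts an already-built transformers pipeline; both routes work with a local directory. A sketch of both, where the local path and generation settings are assumptions:

```python
# Sketch: two ways to wrap a locally stored model for LangChain.
# HuggingFacePipeline has from_model_id (not from_pretrained).
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain.llms import HuggingFacePipeline

local_path = "/models/my-sql-llm"          # assumed local checkpoint directory

# Route 1: let LangChain build the transformers pipeline.
llm = HuggingFacePipeline.from_model_id(
    model_id=local_path,
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 128},
)

# Route 2: build the transformers pipeline yourself and hand it over.
tok = AutoTokenizer.from_pretrained(local_path)
mdl = AutoModelForCausalLM.from_pretrained(local_path)
pipe = pipeline("text-generation", model=mdl, tokenizer=tok, max_new_tokens=128)
llm = HuggingFacePipeline(pipeline=pipe)

print(llm("Write a SQL query that counts orders per customer."))
```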
1 2 3 4 5 6 7 8