Questions tagged [llama]

LLaMA (Large Language Model Meta AI) is a large language model (LLM) released by Meta AI.

55 questions
0 votes, 0 answers

Feasibility of running Falcon/Falcoder/Llama2 LLMs on AWS EC2 Inferentia 2.8xlarge and G4dn.8xLarge instances

Is it possible to run inference on the aforementioned machines? We are facing many issues on Inf2 with the Falcon model. Context: We are facing issues while using Falcon/Falcoder on the Inf2.8xl machine. We were able to run the same experiment on…
Amlan
0 votes, 0 answers

How should I resolve this error in LLaMA: TypeError: __init__() got an unexpected keyword argument 'quantizer'?

When I was running the LLaMA code, I encountered this error: TypeError: __init__() got an unexpected keyword argument 'quantizer', and I don't know how to resolve it. I have checked the version compatibility. Please help me come up with possible…
0 votes, 0 answers

How to specify temperature and max_new_tokens in the curl request to Llama 2 in a Hugging Face Inference Endpoint?

I'm new to AI, so apologies if I use the wrong terminology here. I'm extracting some information from a body of text, and have set up Llama 2 in Hugging Face via their Inference Endpoint so I can call it via curl. The curl works for short inputs and…
Magnus
0 votes, 1 answer

Fine-tune LLaMA 7B model using the PyTorch Lightning framework

I need expert help to solve this issue: fine-tuning the LLaMA 7B model for sentiment classification with instruction fine-tuning. import torch import torch.nn as nn from torch.utils.data import Dataset, DataLoader from transformers import LlamaTokenizer,…
wahid
-1 votes, 0 answers

How to produce a required output when fine-tuning a llama2-7b-chat model on GSM8K?

I am fine-tuning llama2-7b-chat on GSM8K, a dataset that contains about 8k examples of grade school math problems, e.g. "question": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did…
-1 votes, 1 answer

I want to deploy an LLM on SageMaker and it is giving me this error. I've tried different models as well but still face the same error

I'm deploying the "TheBloke/Llama-2-7b-Chat-GPTQ" model on SageMaker. I'm running this code in a SageMaker notebook instance. I've used an "ml.g4dn.xlarge" instance for deployment. I've used the same code that is shown on the deployment on Amazon…
-2 votes, 0 answers

torch.cuda.OutOfMemoryError: CUDA out of memory

When I run the fine-tuned LLaMA model with LoRA to generate results using 1 GPU, this error occurs: torch.cuda.OutOfMemoryError: CUDA out of memory. My code: test_data = Dataset.from_list(torch.load(test_dataset_dir)) tokenizer =…
a7777777
-2 votes, 0 answers

Will Inconsistent Alternation of Responses Affect Fine-Tuning LLAMA2 with Chat History?

I am working on fine-tuning LLAMA2 with a dataset containing chat history. While preparing the data, I've noticed that the dialogue doesn't always follow a pattern of alternating responses between speakers. In some cases, one person responds several…
Ivo Oostwegel
-3 votes, 0 answers

Can QLORAs on StableBeluga-7B learn a personality?

I'm building a repository of QLORA adapters that change the model's personality. The end vision is a hub of ready-to-go personality adapters. I'm hitting a snag when training the QLORAs for Paul Graham's personality on top of a 4-bit quantized…
-4 votes, 0 answers

How to Delete GPT Models, Managing Storage Usage for Installed GPT Models and Packages

I have installed several Generative Pretrained Transformer (GPT) models on my local system for fine-tuning purposes, both within Python in Visual Studio Code and via the Command Prompt window during code execution. The installed models include…
KARTHIK K