Questions tagged [large-language-model]

Use this tag for questions about large language models (LLMs): deep-learning models trained to interpret and generate natural-language text.

118 questions
1 vote · 0 answers

How to restrict out-of-context search in LangChain

I want to restrict the query search to my custom documents for the LLM, but it is showing out-of-context results as well, as shown in the image below. My code is below: for token generation max_input_size = 4096 num_outputs = 512 max_chunk_overlap =…
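For reference, a minimal sketch of one common mitigation, assuming a classic LangChain RetrievalQA setup where `llm` and `vectorstore` already exist (both are placeholders here, not names from the question): give the "stuff" chain a prompt that forbids answering from outside the retrieved context.

```python
# Sketch: constrain the chain with a prompt that only allows answers
# grounded in the retrieved documents. `llm` and `vectorstore` are placeholders.
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

template = """Answer the question using ONLY the context below.
If the answer is not contained in the context, reply "I don't know."

Context:
{context}

Question: {question}
Answer:"""

prompt = PromptTemplate(template=template, input_variables=["context", "question"])

qa = RetrievalQA.from_chain_type(
    llm=llm,                                                      # any LangChain LLM wrapper
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),   # placeholder vector store
    chain_type_kwargs={"prompt": prompt},
)

print(qa.run("A question about the custom documents"))
```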
1 vote · 1 answer

Why does LLM (LLaMA) loss drop staircase-like over epochs?

I'm training an LLM (LLaMA-6B) and have noticed that its loss seems to drop in a stair-like fashion over the epochs: there is little loss change within an epoch, and then the loss suddenly drops quite a bit when a new epoch starts. I'm…
1 vote · 1 answer

How does Hugging Face's zero-shot classification work in production/web apps? Do I need to train the model first?

I have already used Hugging Face's zero-shot classification: I used the "facebook/bart-large-mnli" model as described here (https://huggingface.co/tasks/zero-shot-classification). The accuracy is quite good for my task. My question is about…
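For context, a minimal sketch of how the zero-shot pipeline is typically served: the pretrained NLI checkpoint is used as-is, so no task-specific training is required before wrapping it in a web app (the example text and labels below are illustrative).

```python
# Sketch: the zero-shot pipeline reuses the pretrained NLI model directly,
# so it can be loaded once at app start-up and called per request.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "I just bought a new laptop and the battery life is amazing.",
    candidate_labels=["electronics", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```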
1 vote · 1 answer

How many neurons (units) are there in the BERT model?

How to estimate the number of neurons (units) in the BERT model? Note this is different from the number of model parameters.
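One possible way to produce such an estimate, under the assumption that "neurons" means the hidden units of each encoder layer rather than trainable parameters, is to read the architecture hyper-parameters from the config; this is a sketch, not the only reasonable definition.

```python
# Rough estimate of "units" from the architecture hyper-parameters, assuming
# "neurons" means hidden activations per layer, not trainable parameters.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("bert-base-uncased")

# each encoder layer has a hidden_size-wide output and an
# intermediate_size-wide feed-forward layer
units_per_layer = cfg.hidden_size + cfg.intermediate_size
total_units = cfg.num_hidden_layers * units_per_layer

print(cfg.hidden_size, cfg.intermediate_size, cfg.num_hidden_layers)
print("approximate number of units:", total_units)  # 12 * (768 + 3072) = 46,080
```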
1 vote · 1 answer

Why does the bart-large-cnn summarization model give odd output with different length settings?

I have a piece of text of 4226 characters (316 words plus special characters). I am trying different combinations of min_length and max_length to get a summary: print(summarizer(INPUT, max_length=1000, min_length=500, do_sample=False)) With the…
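A relevant detail, sketched below with illustrative length values: min_length and max_length are counted in tokens, not characters, so forcing a min_length of 500 on a roughly 300-word input pushes the model past what it can say and tends to produce padding or repetition; lengths well below the input's token count behave better.

```python
# Sketch: lengths are measured in tokens, not characters; keep them well
# below the input's token count. The numbers here are example values.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

text = "..."  # the ~316-word input from the question

summary = summarizer(text, max_length=130, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```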
0 votes · 0 answers

Cryptic CUDA error when fine-tuning a sequence classification model

I am working on fine-tuning Llama 2 7B for sequence classification using QLoRA. I am using a single A100 GPU and get the same cryptic CUDA error even when increasing to multiple GPUs, increasing CPU memory, and using a batch size of 1. This is the…
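Not a definitive fix, but a common debugging sketch for this kind of error: device-side asserts during sequence classification are frequently caused by label ids outside num_labels or a missing pad token, and CUDA_LAUNCH_BLOCKING=1 makes the real failing operation appear in the traceback. `dataset` and `model` below are placeholders for the user's own objects.

```python
# Debugging sketch, not a guaranteed fix.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"   # set before any CUDA work for a usable traceback

# `dataset` and `model` are placeholders for the user's own objects
labels = set(dataset["label"])
assert max(labels) < model.config.num_labels, "a label id is outside num_labels"

# Llama has no pad token by default; sequence classification batches need one
model.config.pad_token_id = model.config.eos_token_id
```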
0 votes · 1 answer

How to directly load a fine-tuned model like Alpaca-LoRA (PeftModel()) from local files instead of loading it from Hugging Face models?

I have fine-tuned a Llama model using low-rank adaptation (LoRA) with the peft package. The resulting files adapter_config.json and adapter_model.bin are saved. I can load the fine-tuned model from Hugging Face using the following code: model =…
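A minimal sketch for this case: PeftModel.from_pretrained also accepts a local directory, so pointing it at the folder that holds adapter_config.json and adapter_model.bin avoids the Hub. The base-model id and adapter path below are placeholders.

```python
# Sketch: load the LoRA adapter from a local directory instead of the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "decapoda-research/llama-7b-hf"        # whichever base the adapter was trained on
base_model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# local directory containing adapter_config.json and adapter_model.bin
model = PeftModel.from_pretrained(base_model, "./lora-alpaca-output")
```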
0 votes · 1 answer

Getting a Peft version error during AutoTrain fine-tuning on Llama 2

I did some Llama 2 fine-tuning with AutoTrain on Google Colab. This is a sample text column for fine-tuning: ###Human: Here is the OCR text extracted from a VHS tape cover. Yes, the text is surely extracted from a VHS tape, but it may have some…
0 votes · 0 answers

LLM token embeddings

Hi, I'm just getting started with understanding transformer-based models, and I am not able to find how the token embeddings are arrived at. There are multiple tokenization approaches and multiple vocabularies/documents LLMs are trained on, so my…
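A short sketch of the underlying mechanism: token embeddings are not derived from the tokenizer itself; they are rows of a learned embedding matrix trained jointly with the rest of the model, and the tokenizer only supplies the integer ids used to index that matrix.

```python
# Sketch: the embedding layer is a plain lookup table (nn.Embedding);
# the tokenizer maps text to the ids used to index it.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

ids = tokenizer("token embeddings", return_tensors="pt")["input_ids"]
embedding_layer = model.get_input_embeddings()   # torch.nn.Embedding(vocab_size, hidden_dim)
vectors = embedding_layer(ids)                   # pure lookup, no extra computation

print(embedding_layer.weight.shape)              # (50257, 768) for GPT-2
print(vectors.shape)                             # (1, seq_len, 768)
```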
0 votes · 0 answers

Questions about distributed fine-tuning of a transformers model (ChatGLM) with Accelerate on Kaggle GPUs

I am trying to fine-tune the chatglm-6b model using LoRA with transformers and peft on Kaggle GPUs (2×T4). The model structure: the traditional loading method (AutoModel.from_pretrained) needs to load the model itself (15 GB) onto the CPU first, whereas…
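One possible way around the CPU-first load, sketched under the assumption that accelerate (and, for the optional 8-bit flag, bitsandbytes) is installed: let from_pretrained build the model empty and stream the shards directly onto the two T4s with a device map.

```python
# Sketch: shard the checkpoint across both GPUs at load time instead of
# materialising the full 15 GB model on the CPU. The 8-bit flag is optional.
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b",
    trust_remote_code=True,   # chatglm ships custom modelling code
    device_map="auto",        # accelerate places the layers across both GPUs
    load_in_8bit=True,        # optional: 8-bit weights to fit 2 x 16 GB T4s
)
```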
0 votes · 0 answers

How to load the fine-tuned model (merged weights) on Colab?

I have fine-tuned the Llama 2 model, reloaded the base model, and merged the LoRA weights. I saved this final model again and now I intend to run it. base_model = AutoModelForCausalLM.from_pretrained( model_name, …
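A sketch of the usual flow, with placeholder names (model_name, adapter_dir, the output path): once the LoRA weights are merged and saved, the result is a plain causal-LM checkpoint that loads without peft.

```python
# Sketch: merge once, save, then reload as an ordinary checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# one-time merge (after fine-tuning); model_name and adapter_dir are placeholders
base = AutoModelForCausalLM.from_pretrained(model_name)
merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
merged.save_pretrained("merged-llama2")

# later, e.g. in a fresh Colab session
model = AutoModelForCausalLM.from_pretrained("merged-llama2", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)
```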
0 votes · 0 answers

Get the positive score in a classification task by using a generative model

I'm attempting to use a generative model (Llama 2) for a binary classification task and aim to obtain the positive score, which represents the confidence level for the positive label. I tried to use compute_transition_scores but am not sure how I can…
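A hedged sketch of one way to use compute_transition_scores for this, assuming the prompt is phrased so the model answers with a single label token; `model`, `tokenizer`, and `prompt` are placeholders. Comparing the logits of the two label tokens directly is another common option.

```python
# Sketch: generate with scores and convert the per-step log-probabilities
# into a probability for the generated label token.
import torch

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=1,                 # expect a single label token, e.g. "yes"/"no"
    return_dict_in_generate=True,
    output_scores=True,
)

scores = model.compute_transition_scores(
    out.sequences, out.scores, normalize_logits=True
)                                      # log-probabilities of the generated tokens

label_token = tokenizer.decode(out.sequences[0, -1:])
confidence = torch.exp(scores[0, 0]).item()   # probability of the generated label token
print(label_token, confidence)
```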
0 votes · 0 answers

Speed up an LLM in LangChain

My project is a natural-language-based search engine. I don't use an eGPU or an M1/M2. Here is a part of my code: import os from typing import Any, List # from llm import CustomLLM from langchain.chains import RetrievalQA from…
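Two cheap levers when everything runs on the CPU, sketched with placeholder objects (`vectorstore`, `llm`): retrieve fewer chunks so the prompt stays short, and cap how many tokens the wrapped model may generate (that limit lives on the specific LLM wrapper, e.g. its max_tokens-style parameter).

```python
# Sketch: a shorter prompt and a smaller output budget cut CPU time per query.
from langchain.chains import RetrievalQA

# fewer retrieved chunks => shorter prompt => less time spent in the model
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

qa = RetrievalQA.from_chain_type(
    llm=llm,                 # placeholder; a smaller or quantized local model also helps
    chain_type="stuff",
    retriever=retriever,
)
```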
0 votes · 1 answer

How to solve an AssertionError when loading LLaMA 2 70B in Google Colab?

I am trying to run LLaMA 2 70B in Google Colab, using a GGML file: TheBloke/Llama-2-70B-Chat-GGML. Here is my current code that I am using to run it: !pip install huggingface_hub model_name_or_path = "TheBloke/Llama-2-70B-Chat-GGML" model_basename =…
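If the GGML file is run through llama-cpp-python, one known cause of an AssertionError with the 70B checkpoints is the grouped-query attention setting: they need n_gqa=8 when the model is created. A hedged sketch follows; the exact .bin filename should be taken from the repo's file list.

```python
# Sketch: download one quantised GGML file and load it with n_gqa=8.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-70B-Chat-GGML",
    filename="llama-2-70b-chat.ggmlv3.q4_0.bin",   # pick the actual basename from the repo
)

# 70B models use grouped-query attention; older llama.cpp builds assert without n_gqa=8
llm = Llama(model_path=model_path, n_ctx=2048, n_gqa=8)
out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```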
0 votes · 1 answer

LangChain: custom output parser not working with ConversationChain

I am creating a chatbot with LangChain's ConversationChain, so it needs conversation memory. However, at the end of each of its responses it adds a new line and writes a bunch of gibberish, so I created a custom output parser to remove this…
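A minimal sketch of one workaround, with illustrative names: define the parser as usual but apply it to the chain's raw output instead of wiring it into ConversationChain, which side-steps how the chain and its memory handle parsers internally (`conversation` stands in for the existing ConversationChain).

```python
# Sketch: strip trailing gibberish by post-processing the chain output.
from langchain.schema import BaseOutputParser

class FirstLineParser(BaseOutputParser[str]):
    """Keep only the text before the first blank line / trailing gibberish."""

    def parse(self, text: str) -> str:
        return text.split("\n\n", 1)[0].strip()

parser = FirstLineParser()
raw = conversation.predict(input="Hi there!")   # `conversation` is the ConversationChain
print(parser.parse(raw))
```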