Questions tagged [llm]

A general tag for large language model (LLM)-related subjects. Please ALWAYS use a more specific tag if one is available (GPT variants, PaLM, LLaMA, BLOOM, Claude, etc.).

A large language model is characterized by its large size. Its size is made possible by AI accelerators, which are able to process huge amounts of text data, usually scraped from the internet.

200 questions
2
votes
1 answer

Suppress LlamaCpp stats output

How can I suppress LlamaCpp stats output in Langchain ... equivalent code: llm = LlamaCpp(model_path=..., ...) llm('who is Caesar') > who is Caesar? Julius Caesar was a Roman general and statesman who played a critical role in the events that…
sten • 7,028
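
One common workaround, sketched below: llama.cpp prints its stats from C code straight to file descriptor 2, so verbose=False alone may not silence everything, and the descriptor itself is swapped out around the call. The model path is a placeholder.

    import os
    from contextlib import contextmanager
    from langchain.llms import LlamaCpp

    @contextmanager
    def suppress_stderr():
        # llama.cpp writes timing stats from C code directly to fd 2,
        # so we swap the descriptor itself rather than sys.stderr.
        devnull = os.open(os.devnull, os.O_WRONLY)
        saved = os.dup(2)
        os.dup2(devnull, 2)
        try:
            yield
        finally:
            os.dup2(saved, 2)
            os.close(saved)
            os.close(devnull)

    llm = LlamaCpp(model_path="models/llama-2-7b.gguf", verbose=False)  # placeholder path
    with suppress_stderr():
        answer = llm("who is Caesar")
    print(answer)
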
2
votes
0 answers

LLaMA: reached the end of the context window so resizing

I'm currently working on a project where I'm using the LLaMA library for natural language processing tasks. However, I've encountered an error message that I'm struggling to resolve. The error states: "LLaMA: reached the end of the context window so…
ReFreeman • 33
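
The message usually means the prompt plus generated tokens exceeded n_ctx. A minimal sketch, assuming llama-cpp-python and a placeholder model path; raising n_ctx (up to what the model was trained for), or trimming the prompt, avoids the resize.

    from llama_cpp import Llama

    # n_ctx sets the context window; once prompt + generated tokens exceed it,
    # llama.cpp reports "reached the end of the context window so resizing".
    llm = Llama(model_path="models/llama-2-7b.gguf", n_ctx=2048)  # placeholder path
    out = llm("...long prompt here...", max_tokens=256)
    print(out["choices"][0]["text"])
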
2
votes
1 answer

Fine-tuning open LLMs

I am a newbie trying to learn fine-tuning. I started with the Falcon 7B Instruct LLM as my base model and want to fine-tune it with the Open Assistant instruct dataset. I have a 2080 Ti with 11 GB VRAM, so I am using 4-bit quantization and LoRA. These are the…
codemugal • 81
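
For reference, a minimal QLoRA-style setup of the kind the question describes; the hyperparameters are illustrative assumptions, and float16 is used because a 2080 Ti does not support bfloat16.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    model_id = "tiiuae/falcon-7b-instruct"
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,  # 2080 Ti has no bfloat16 support
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",
        trust_remote_code=True,
    )
    model = prepare_model_for_kbit_training(model)

    lora_config = LoraConfig(
        r=16, lora_alpha=32, lora_dropout=0.05,  # illustrative values
        target_modules=["query_key_value"],      # Falcon's fused attention projection
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()
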
2
votes
2 answers

add memory to create_pandas_dataframe_agent in Langchain

I am trying to add memory to create_pandas_dataframe_agent to perform post-processing on a model that I trained using Langchain. I am using the following code at the moment. from langchain.llms import OpenAI import pandas as pd df =…
Matt • 85
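
A sketch of the commonly suggested route, assuming a langchain version whose create_pandas_dataframe_agent accepts agent_executor_kwargs; whether the memory is actually wired into the agent's prompt varies by version, and data.csv is a placeholder.

    import pandas as pd
    from langchain.llms import OpenAI
    from langchain.agents import create_pandas_dataframe_agent
    from langchain.memory import ConversationBufferMemory

    df = pd.read_csv("data.csv")  # placeholder
    memory = ConversationBufferMemory(memory_key="chat_history")

    # agent_executor_kwargs forwards arguments to the underlying AgentExecutor;
    # version-dependent whether the memory is picked up by the prompt.
    agent = create_pandas_dataframe_agent(
        OpenAI(temperature=0),
        df,
        verbose=True,
        agent_executor_kwargs={"memory": memory},
    )
    agent.run("Summarize the first five rows")
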
2
votes
0 answers

Using OpenAI LLMs for classification. Asking for classification vs. asking for probabilities

I'm using LLMs for classifying products into specific categories (multi-class). One way to do it would be to ask for a yes/no on a specific category and loop through the categories. Another way would be to ask for a probability that that…
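
A hedged sketch of the probability route using token log-probs rather than a yes/no loop, assuming the pre-1.0 openai Python client and a completions model that supports logprobs; the category names are made up.

    import math
    import openai  # pre-1.0 client assumed

    prompt = (
        "Classify the product into one of: Electronics, Clothing, Food.\n"
        "Product: wireless noise-cancelling headphones\n"
        "Category:"
    )
    resp = openai.Completion.create(
        model="gpt-3.5-turbo-instruct",
        prompt=prompt,
        max_tokens=1,
        temperature=0,
        logprobs=5,  # log-probs of the top 5 candidate first tokens
    )
    top = resp["choices"][0]["logprobs"]["top_logprobs"][0]
    probs = {token.strip(): math.exp(lp) for token, lp in top.items()}
    print(probs)  # one call yields a distribution instead of a yes/no loop
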
2
votes
1 answer

GGML (llama.cpp) models become dumb when used in Python

I am struggling with the issue of models not following instructions at all when they are used in Python; however, they work much better when they are used in a shell (like cmd or PowerShell). Python examples: Question: llm("Can you solve math…
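
A frequent cause is that shell front-ends apply the model's instruction template while a raw llm(...) call in Python does not. A minimal sketch assuming a Llama-2-chat GGML file (placeholder path) and its [INST] format.

    from langchain.llms import LlamaCpp

    llm = LlamaCpp(
        model_path="models/llama-2-7b-chat.ggmlv3.q4_0.bin",  # placeholder path
        temperature=0.1,
    )

    # Shell front-ends usually wrap input in the model's instruction template;
    # a raw llm("...") call sends the bare string, so chat-tuned models ramble.
    prompt = "[INST] Can you solve math problems? What is 12 * 7? [/INST]"
    print(llm(prompt))
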
2
votes
0 answers

Improving the performance of a question answering model (BERT and GPT) when predicting without a GPU

I downloaded a Python script which does question answering using BERT and GPT. Unfortunately, this script requires a GPU for its predictions: when run on a GPU it takes only 1 sec per question, but when run on a CPU it takes more than 3 minutes per…
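
When a GPU is unavailable, dynamic int8 quantization of the Linear layers is one plain-PyTorch option that often gives a 2-4x CPU speed-up at a small accuracy cost; a sketch using a small extractive QA model as a stand-in for the script's models.

    import torch
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    name = "distilbert-base-cased-distilled-squad"  # stand-in QA model
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForQuestionAnswering.from_pretrained(name).eval()

    # Dynamic int8 quantization rewrites the Linear layers to use
    # quantized weights, which speeds up CPU-only inference noticeably.
    model = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )
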
1
vote
1 answer

Using langchain for text to SQL using custom llm API

I am trying to use my llama2 model (exposed as an API using ollama). I want to chat with the llama agent and query my Postgres db (i.e. generate text-to-SQL). I was able to find langchain code that uses OpenAI to do this. However, I am unable to…
A_K • 81
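
A sketch of one way to wire this up, assuming a langchain version where the Ollama LLM lives in langchain.llms and SQLDatabaseChain has moved to langchain_experimental; the connection string is a placeholder.

    from langchain.llms import Ollama
    from langchain.utilities import SQLDatabase
    from langchain_experimental.sql import SQLDatabaseChain

    llm = Ollama(model="llama2")  # the model served by `ollama run llama2`
    db = SQLDatabase.from_uri(
        "postgresql+psycopg2://user:password@localhost:5432/mydb"  # placeholder DSN
    )
    chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
    chain.run("How many orders were placed last month?")
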
1
vote
1 answer

create_csv_agent with HuggingFace: could not parse LLM output

I am using Langchain and applying create_csv_agent on a small csv dataset to see how well google/flan-t5-xxl can query answers from tabular data. As of now, I am experiencing the problem of 'OutputParserException: Could not parse LLM output: `0`' >…
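
One commonly suggested mitigation is to let the executor retry on unparseable output; a sketch assuming create_csv_agent forwards agent_executor_kwargs to the underlying AgentExecutor (this forwarding is version-dependent), with a placeholder CSV path.

    from langchain.llms import HuggingFaceHub
    from langchain.agents import create_csv_agent

    llm = HuggingFaceHub(repo_id="google/flan-t5-xxl", model_kwargs={"temperature": 0.1})
    agent = create_csv_agent(
        llm,
        "data.csv",  # placeholder
        verbose=True,
        # retry instead of raising OutputParserException on unparseable output
        agent_executor_kwargs={"handle_parsing_errors": True},
    )
    agent.run("How many rows are there?")
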
1
vote
1 answer

Very slow Response from LLM based Q/A query engine

I built a Q/A query bot over a 4 MB csv file I have locally. I'm using Chroma for vector DB creation, with the embedding model being Instructor Large from Hugging Face and the LLM chat model being LlamaCPP llama2-13b-chat. The vector database created…
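
Latency with LlamaCPP is often dominated by CPU-only inference; a sketch of the usual knobs, assuming a CUDA/Metal build of llama-cpp-python and a placeholder model path.

    from langchain.llms import LlamaCpp

    # n_gpu_layers offloads transformer layers to the GPU (needs a CUDA/Metal
    # build of llama-cpp-python); n_batch raises prompt-processing throughput.
    llm = LlamaCpp(
        model_path="models/llama-2-13b-chat.Q4_K_M.gguf",  # placeholder path
        n_gpu_layers=35,   # illustrative; tune to available VRAM
        n_batch=512,
        n_ctx=4096,
    )
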
1
vote
0 answers

Save an LLM after adding a RAG pipeline and embedding model, and deploy as a Hugging Face inference endpoint?

I have created a RAG (retrieval-augmented generation) pipeline and am using it with a 4-bit quantized OpenLLaMA 13B loaded directly from Hugging Face, without fine-tuning the model. First I need to save the model locally. But after using…
No Flag • 11
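
For the saving step itself, the usual transformers calls are save_pretrained / push_to_hub; a sketch with placeholder local path and repo name. Note that serializing 4-bit bitsandbytes weights is only supported in newer transformers/bitsandbytes versions, so this is an assumption about the installed stack.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "openlm-research/open_llama_13b"  # assumed checkpoint
    model = AutoModelForCausalLM.from_pretrained(name, load_in_4bit=True, device_map="auto")
    tokenizer = AutoTokenizer.from_pretrained(name)

    model.save_pretrained("./openllama-13b-local")      # placeholder path
    tokenizer.save_pretrained("./openllama-13b-local")
    # or publish for a Hugging Face Inference endpoint (placeholder repo name):
    # model.push_to_hub("your-username/openllama-13b-rag")
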
1
vote
2 answers

RecursiveCharacterTextSplitter of Langchain doesn't exist

I am trying to do text chunking with LangChain's RecursiveCharacterTextSplitter model. I have installed langchain (pip install langchain[all]), but the program still reports there is no RecursiveCharacterTextSplitter package. I use from…
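
For reference, the class has lived at this import path in recent langchain versions; a minimal sketch.

    from langchain.text_splitter import RecursiveCharacterTextSplitter

    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_text("some long document text ...")
    print(len(chunks))
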
1
vote
1 answer

Sentence embeddings from LLAMA 2 Huggingface opensource

Could anyone let me know if there is any way of getting sentence embeddings from meta-llama/Llama-2-13b-chat-hf from huggingface? Model link: https://huggingface.co/meta-llama/Llama-2-13b-chat-hf I tried using the transformers.AutoModel module from…
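
One workable approach is mean-pooling the base model's last hidden state; a sketch assuming gated access to the checkpoint has been granted, with an illustrative pooling scheme rather than an official embedding API.

    import torch
    from transformers import AutoModel, AutoTokenizer

    name = "meta-llama/Llama-2-13b-chat-hf"  # gated: requires approved access
    tokenizer = AutoTokenizer.from_pretrained(name)
    tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default
    model = AutoModel.from_pretrained(name, torch_dtype=torch.float16, device_map="auto")

    sentences = ["A decoder-only model can still yield usable embeddings."]
    inputs = tokenizer(sentences, return_tensors="pt", padding=True).to(model.device)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (batch, seq, dim)

    # Mean-pool over real tokens only, using the attention mask.
    mask = inputs["attention_mask"].unsqueeze(-1)
    embeddings = (hidden * mask).sum(1) / mask.sum(1)
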
1
vote
1 answer

How to extract sub-string from Haystack's print_answers

I was following this tutorial from pinecone.io about using Haystack's print_answers. As you can see in the later part of the tutorial, the output carries a lot of text. This string-like output is not subscriptable, and thus I'm not able to…
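
Rather than parsing the pretty-printed output, the run() result can be used directly; a sketch assuming the reader and retriever built earlier in the pinecone.io tutorial and a Haystack v1-style pipeline.

    from haystack.pipelines import ExtractiveQAPipeline

    # reader and retriever are the components built earlier in the tutorial
    pipeline = ExtractiveQAPipeline(reader=reader, retriever=retriever)

    result = pipeline.run(query="who is Caesar?", params={"Retriever": {"top_k": 5}})
    best = result["answers"][0]      # a haystack Answer object, not printed text
    print(best.answer, best.score)   # plain str/float: easy to slice or substring
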
1
vote
0 answers

Llama.generate: prefix-match hit

I am using "llama-2-7b-chat.ggmlv3.q2_K.bin" (from hugging-face) using "LlamaCpp()" in langchain. The process of "Llama.generate: prefix-match hit" repeats itself so many times and answers itself. But I want answer only once. How can I set this to…
1
2
3
13 14