Questions tagged [language-model]
266 questions
4
votes
2 answers
Negative results using kenlm
I am new to language modeling and made a 3-gram language model using kenlm (or this) from a large text file (~7 GB).
I built a binary file from my language model and load it in Python like this:
import kenlm
model = kenlm.LanguageModel(

Emad Helmi
- 75
- 5
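The negative numbers here are expected: kenlm's score() returns a base-10 log probability, which is always negative for probabilities below 1. A minimal sketch, with a hypothetical binary filename since the post truncates the actual call:

import kenlm

# hypothetical filename; the original post truncates the real path
model = kenlm.LanguageModel("3gram.binary")

# score() returns a log10 probability, so negative values are normal
logprob = model.score("this is a sentence", bos=True, eos=True)
print(logprob)        # e.g. -12.3, on the log10 scale
print(10 ** logprob)  # the underlying probability, between 0 and 1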
4
votes
1 answer
unable to open Cube language model params for hindi Language in tesseract
Tesseract is unable to read the cube language model.
tesseract 1.png output.txt -l hin
After executing the above command, the following error occurs:
Cube ERROR (CubeRecoContext::Load): unable to read cube language model params from…

Madhav Nikam
- 153
- 2
- 9
3
votes
0 answers
How is scaled_dot_product_attention meant to be used with cached keys/values in causal LM?
I'm implementing a transformer and I have everything working, including attention using the new scaled_dot_product_attention from PyTorch 2.0. I'll only be doing causal attention, however, so it seems like it makes sense to use the is_causal=True…

turboderp
- 31
- 2
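For context, a sketch of the usual pattern for one cached decoding step (names are illustrative, not from the post): during prefill, where q, k, and v all have the full sequence length, is_causal=True is correct; but on an incremental step the single new query may attend to every cached position, so no causal mask is needed there.

import torch
import torch.nn.functional as F

def decode_step(q, new_k, new_v, k_cache, v_cache):
    # q, new_k, new_v:  (batch, heads, 1, head_dim) -- the newly generated position
    # k_cache, v_cache: (batch, heads, past_len, head_dim)
    k = torch.cat([k_cache, new_k], dim=2)
    v = torch.cat([v_cache, new_v], dim=2)
    # every cached position lies in the past of the single query,
    # so full attention is already causal here
    out = F.scaled_dot_product_attention(q, k, v, is_causal=False)
    return out, k, v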
3
votes
1 answer
Finetuning an LM vs prompt-engineering an LLM
Is it possible to finetune a much smaller language model like RoBERTa on, say, a customer service dataset and get results as good as one might get by prompting GPT-4 with parts of the dataset?
Can a fine-tuned RoBERTa model learn to follow…

Tolu
- 1,081
- 1
- 8
- 23
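For reference, a minimal sketch of what fine-tuning RoBERTa on such a dataset looks like with Hugging Face Transformers; the file name, column names, and label count are assumptions, not from the post:

from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import load_dataset

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base",
                                                           num_labels=5)  # assumed

# hypothetical CSV with "text" and "label" columns
ds = load_dataset("csv", data_files="customer_service.csv")
ds = ds.map(lambda b: tok(b["text"], truncation=True), batched=True)

args = TrainingArguments(output_dir="roberta-cs", num_train_epochs=3,
                         per_device_train_batch_size=16)
# passing the tokenizer enables dynamic padding via the default collator
Trainer(model=model, args=args, train_dataset=ds["train"], tokenizer=tok).train()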
3
votes
2 answers
About BertForMaskedLM
I have recently read about BERT and want to use BertForMaskedLM for the fill-mask task. I know about the BERT architecture. Also, as far as I know, BertForMaskedLM is built from BERT with a language modeling head on top, but I have no idea about what…

Đặng Huy
- 31
- 2
- 3
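To illustrate what the language-modeling head adds: it projects each hidden state of the base encoder to vocabulary logits, so [MASK] positions can be decoded to tokens. A short sketch with the Hugging Face API:

import torch
from transformers import BertTokenizer, BertForMaskedLM

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

inputs = tok("Paris is the [MASK] of France.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (batch, seq_len, vocab_size)

# find the masked position and take the highest-scoring token
mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
print(tok.decode(int(logits[0, mask_pos].argmax())))  # expected: "capital"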
3
votes
2 answers
Huggingface Transformer - GPT2 resume training from saved checkpoint
Resuming GPT-2 finetuning, implemented from run_clm.py.
Does Hugging Face's GPT-2 have a parameter to resume training from a saved checkpoint instead of training again from the beginning? Suppose the Python notebook crashes while training, the…

Woody
- 930
- 9
- 23
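The Trainer that run_clm.py builds on does support this: train() accepts a resume_from_checkpoint argument. A sketch, assuming checkpoints were already saved to output_dir by the crashed run (train_ds stands in for your tokenized dataset):

from transformers import Trainer, TrainingArguments, GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")
args = TrainingArguments(output_dir="clm_out", save_steps=500)
trainer = Trainer(model=model, args=args, train_dataset=train_ds)

trainer.train(resume_from_checkpoint=True)  # resumes from the newest clm_out/checkpoint-*
# or name a specific checkpoint (hypothetical path):
# trainer.train(resume_from_checkpoint="clm_out/checkpoint-500")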
3
votes
0 answers
HuggingFace Trainer Segmentation Fault
The Hugging Face Trainer keeps giving a segmentation fault with this setup code.
The dataset is around 600 MB, and the server has two 32 GB Nvidia V100 GPUs. Can anyone help find the issue?
from transformers import Trainer, TrainingArguments,…

efe23eds
- 51
- 4
3
votes
0 answers
What are some techniques to improve contextual accuracy of semantic search engine using BERT?
I am implementing a semantic search engine using BERT (with cosine distance). To a certain extent, the method is able to find sentences in a high-level context. However, when it comes to the narrowed-down context of a sentence, it gives several…

buddy
- 189
- 2
- 16
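For reference, a minimal sketch of the cosine-distance setup described above, using mean pooling over BERT's token embeddings; this is one common choice of sentence representation, and a dedicated sentence-embedding model is usually a stronger baseline for narrow contexts:

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (B, T, H)
    mask = batch.attention_mask.unsqueeze(-1)          # zero out padding
    return (hidden * mask).sum(1) / mask.sum(1)        # mean pooling

corpus = ["refund policy for damaged items", "how to reset my password"]
scores = F.cosine_similarity(embed(["I forgot my login credentials"]), embed(corpus))
print(corpus[scores.argmax()])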
3
votes
1 answer
fastai: ValueError: __len__() should return >= 0
While running the following program - https://rawgit.com/sizhky/eef1482e63387df8e9e045ac1e5a0ce8/raw/bdbebafaab21739a27f6bf32e83da1557919b44b/lm.html
I'm unable to call learner.fit as it throws the above error.
Specifically,
I'm trying to train a…

Yesh
- 976
- 12
- 15
3
votes
1 answer
Keras shape error when checking input
I am trying to train a simple MLP model that maps input questions (using a 300D word embedding) and image features extracted using a pretrained VGG16 model to a feature vector of fixed length. However, I can't figure out how to fix the error…

Vanessa.C
- 45
- 1
- 5
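Shape errors like this usually mean the Input layers do not match the arrays being fed in. A sketch of the two-branch MLP described above, with assumed dimensions (300-d question embedding, 4096-d VGG16 fc features, 512-d output):

from tensorflow import keras
from tensorflow.keras import layers

q_in = keras.Input(shape=(300,), name="question")   # 300-D word-embedding input
img_in = keras.Input(shape=(4096,), name="image")   # VGG16 fc features (assumed size)
x = layers.concatenate([q_in, img_in])
x = layers.Dense(1024, activation="relu")(x)
out = layers.Dense(512, name="feature_vector")(x)   # fixed-length output (assumed size)

model = keras.Model([q_in, img_in], out)
model.summary()  # compare these shapes against the arrays you pass to fit()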
3
votes
1 answer
Correct way to calculate probabilities using ARPA LM data
I am writing a small library for calculating n-gram probabilities.
I have an LM described by an ARPA file (it's quite a simple format: probability ngram backoff_weight):
...
-5.1090264 Hello -0.05108307
-5.1090264 Bob -0.05108307
-3.748848 we…

Bob
- 5,809
- 5
- 36
- 53
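The standard lookup rule for ARPA data: if the full n-gram is listed, use its log10 probability; otherwise add the context's backoff weight (0 if the context isn't listed) and recurse on the shortened n-gram. A sketch with dicts keyed by word tuples:

def logprob(ngram, prob, backoff):
    # ngram: tuple of words; prob/backoff: dicts of log10 values from the ARPA file
    if ngram in prob:
        return prob[ngram]
    if len(ngram) == 1:
        return float("-inf")  # OOV; real toolkits map this to <unk>
    # back off: add the context's backoff weight, then score the shorter n-gram
    return backoff.get(ngram[:-1], 0.0) + logprob(ngram[1:], prob, backoff)

# e.g. 10 ** logprob(("Hello", "Bob"), prob, backoff) gives the plain probability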
3
votes
1 answer
Error at ARPA model training with SRILM
I have followed this tutorial.
After I run this code:
ngram-count -kndiscount -interpolate -text train-text.txt -lm your.lm
It gives me this error:
"One of modified KneserNey discounts is negative error in discount
estimator for order 2."
How…

ziLk
- 3,120
- 21
- 45
3
votes
1 answer
Input shape for Keras LSTM/GRU language model
I am trying to train a word-level language model in Keras.
I have my X and Y, both with the shape (90582L, 517L)
When I try to fit this model:
print('Build model...')
model = Sequential()
model.add(GRU(512, return_sequences=True,…

ishido
- 4,065
- 9
- 32
- 42
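With X of shape (samples, timesteps) holding word indices, the GRU needs a 3-D input, which an Embedding layer provides; targets can then stay as indices with sparse_categorical_crossentropy. A sketch using the shapes from the question (the vocabulary size and embedding width are assumptions):

from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 10000  # assumption: use your real vocabulary size
seq_len = 517       # matches the (90582, 517) arrays in the question

model = keras.Sequential([
    layers.Embedding(vocab_size, 128, input_length=seq_len),  # (B, 517) -> (B, 517, 128)
    layers.GRU(512, return_sequences=True),                   # one output per timestep
    layers.Dense(vocab_size, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
# Y then holds word indices of shape (90582, 517): one target word per timestep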
3
votes
0 answers
Testing accuracy always more than 99%
I am trying to implement a language model using LSTMs in Theano/Keras. My network runs fine and I also see that the training loss decreases, but the testing accuracy is always above 99% even if I do not train my network for long.
I have used…

Rudra Pratap Singh
- 31
- 2
3
votes
2 answers
command line parameter in word2vec
I want to use word2vec to create my own word-vector corpus from the current version of the English Wikipedia, but I can't find an explanation of the command-line parameters for the program. In the demo script you can find the following:
(text8 is…

Rainflow
- 161
- 1
- 2
- 5
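The flags in the demo script map directly onto gensim's Word2Vec, which can be an easier way to experiment from Python; a sketch as an alternative to the C tool (parameter names follow gensim >= 4.0, file names are hypothetical):

from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# one preprocessed sentence per line (hypothetical file)
sentences = LineSentence("enwiki.txt")

model = Word2Vec(
    sentences,
    vector_size=200,  # embedding dimension     (~ word2vec -size)
    window=8,         # context window          (~ -window)
    min_count=5,      # drop rare words         (~ -min-count)
    sg=0,             # 0 = CBOW, 1 = skip-gram (~ -cbow 1 / 0)
    negative=25,      # negative sampling       (~ -negative)
    workers=4,        # threads                 (~ -threads)
)
model.wv.save_word2vec_format("vectors.bin", binary=True)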