Questions tagged [language-model]

266 questions
4
votes
2 answers

Negative results using kenlm

I am new to language modeling and made a 3-gram language model using KenLM from a large text file (~7 GB). I made a binary file from my language model and call it in Python like this: import kenlm model = kenlm.LanguageModel(
Emad Helmi
  • 75
  • 5
4
votes
1 answer

unable to open Cube language model params for hindi Language in tesseract

Tesseract is unable to read the cube language model. tesseract 1.png output.txt -l hin After the above command executes, the following error occurs: Cube ERROR (CubeRecoContext::Load): unable to read cube language model params from…
Madhav Nikam
  • 153
  • 2
  • 9
3
votes
0 answers

How is scaled_dot_product_attention meant to be used with cached keys/values in causal LM?

I'm implementing a transformer and I have everything working, including attention using the new scaled_dot_product_attention from PyTorch 2.0. I'll only be doing causal attention, however, so it seems like it makes sense to use the is_causal=True…
turboderp
  • 31
  • 2
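A note on this question: with a KV cache, the new query tokens sit at the end of the key sequence, so a plain is_causal=True mask (which, as I understand PyTorch's semantics, aligns the causal triangle to the top-left) does not line up when q_len < kv_len. A minimal pure-Python sketch of the mask you would actually want (the helper name is hypothetical, not a PyTorch API):

```python
def causal_mask_with_cache(q_len, cache_len):
    """Boolean mask of shape (q_len, cache_len + q_len).

    Entry [i][j] is True when query position i (absolute position
    cache_len + i) may attend to key position j. With a KV cache the
    new queries come *after* the cached positions, so row i can see
    every cached key plus the first i + 1 new keys.
    """
    kv_len = cache_len + q_len
    return [[j <= cache_len + i for j in range(kv_len)]
            for i in range(q_len)]

# Single-token decode with 4 cached positions: the one query row may
# attend to all 5 keys, so no masking is needed at all in this case.
print(causal_mask_with_cache(1, 4))
```

For single-token decoding (q_len == 1) every entry comes out True, which is why masking can simply be skipped once the prompt has been processed.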
3
votes
1 answer

Finetuning a LM vs prompt-engineering an LLM

Is it possible to finetune a much smaller language model like RoBERTa on, say, a customer service dataset and get results as good as one might get by prompting GPT-4 with parts of the dataset? Can a fine-tuned RoBERTa model learn to follow…
3
votes
2 answers

About BertForMaskedLM

I have recently read about BERT and want to use BertForMaskedLM for the fill-mask task. I know the BERT architecture. Also, as far as I know, BertForMaskedLM is built from BERT with a language modeling head on top, but I have no idea about what…
3
votes
2 answers

Huggingface Transformer - GPT2 resume training from saved checkpoint

Resuming GPT2 finetuning, implemented from run_clm.py. Does Hugging Face's GPT2 have a parameter to resume training from a saved checkpoint, instead of training again from the beginning? Suppose the Python notebook crashes while training; the…
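For the question above: per the Transformers documentation, trainer.train(resume_from_checkpoint=True) picks up from the latest checkpoint in output_dir (you can also pass a checkpoint path). The general save-and-resume pattern behind it can be sketched in plain stdlib Python (the toy loop and JSON file format are illustrative, not the Trainer's actual mechanism):

```python
import json, os

CKPT = "ckpt.json"

def train(total_steps, save_every=2):
    """Toy training loop that checkpoints its state and can resume
    from the last saved step after a crash or restart."""
    state = {"step": 0}
    if os.path.exists(CKPT):               # resume instead of restarting
        with open(CKPT) as f:
            state = json.load(f)
    while state["step"] < total_steps:
        state["step"] += 1                 # stand-in for a real train step
        if state["step"] % save_every == 0:
            with open(CKPT, "w") as f:
                json.dump(state, f)
    return state["step"]

if os.path.exists(CKPT):
    os.remove(CKPT)                        # fresh run for the demo
print(train(5))   # runs steps 1-5, checkpointing at steps 2 and 4
print(train(8))   # resumes from the saved step 4, runs 5-8
```

The second call does not repeat steps 1-4; it reloads the checkpoint and continues, which is exactly the behaviour the question is after.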
3
votes
0 answers

HuggingFace Trainer Segmentation Fault

Hugging Face Trainer keeps giving a segmentation fault with this setup code. The dataset is around 600 MB, and the server has 2×32 GB Nvidia V100 GPUs. Can anyone help find the issue? from transformers import Trainer, TrainingArguments,…
3
votes
0 answers

What are some techniques to improve contextual accuracy of semantic search engine using BERT?

I am implementing a semantic search engine using BERT (using cosine distance). To a certain extent the method is able to find sentences in a high-level context. However, when it comes to the narrowed-down context of a sentence, it gives several…
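Since the question mentions cosine distance: the core ranking step can be sketched in plain Python, with toy 3-d vectors standing in for BERT sentence embeddings (all names and data here are illustrative). A common way to improve accuracy on narrow contexts is to re-rank the top hits with a cross-encoder, but the baseline looks like:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, corpus):
    """Return (sentence, score) pairs sorted most-similar first."""
    scored = [(sent, cosine_similarity(query_vec, vec))
              for sent, vec in corpus]
    return sorted(scored, key=lambda p: p[1], reverse=True)

corpus = [("refund policy",  [0.9, 0.1, 0.0]),
          ("shipping times", [0.1, 0.9, 0.2]),
          ("return an item", [0.8, 0.2, 0.1])]
query = [1.0, 0.0, 0.1]           # toy embedding of "how do I get a refund?"
print(rank(query, corpus)[0][0])  # most similar sentence
</antml>```

Cosine similarity ignores vector magnitude, which is why it is the usual choice for comparing sentence embeddings of different lengths and styles.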
3
votes
1 answer

fastai: ValueError: __len__() should return >= 0

While running the following program - https://rawgit.com/sizhky/eef1482e63387df8e9e045ac1e5a0ce8/raw/bdbebafaab21739a27f6bf32e83da1557919b44b/lm.html I'm unable to call learner.fit as it throws the above error. Specifically, I'm trying to train a…
Yesh
  • 976
  • 12
  • 15
3
votes
1 answer

Keras shape error when checking input

I am trying to train a simple MLP model that maps input questions (using a 300D word embedding) and image features extracted using a pretrained VGG16 model to a feature vector of fixed length. However, I can't figure out how to fix the error…
Vanessa.C
  • 45
  • 1
  • 5
3
votes
1 answer

Correct way to calculate probabilities using ARPA LM data

I am writing a small library for calculating n-gram probabilities. I have a LM described by an ARPA file (it's quite a simple format: probability ngram backoff_weight): ... -5.1090264 Hello -0.05108307 -5.1090264 Bob -0.05108307 -3.748848 we…
Bob
  • 5,809
  • 5
  • 36
  • 53
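For reference, ARPA files store log10 probabilities, and the standard backoff rule is: if the full n-gram is absent, add the context's backoff weight to the log probability of the shortened n-gram. A minimal sketch using the unigram values quoted in the question (the bigram entry is made up for illustration):

```python
# ARPA stores log10 probabilities plus a backoff weight per context.
# If the full n-gram is missing, add the context's backoff weight to
# the shorter n-gram's probability (all arithmetic in log10 space).
unigrams = {"Hello": (-5.1090264, -0.05108307),   # word -> (logp, backoff)
            "Bob":   (-5.1090264, -0.05108307)}
bigrams = {("Hello", "Bob"): -3.0}                # hypothetical entry

def bigram_logprob(w1, w2):
    if (w1, w2) in bigrams:                # bigram present: use it directly
        return bigrams[(w1, w2)]
    backoff = unigrams[w1][1]              # backoff weight of the context
    return backoff + unigrams[w2][0]       # back off to the unigram

print(bigram_logprob("Hello", "Bob"))      # seen bigram: -3.0
print(bigram_logprob("Bob", "Hello"))      # backed off: backoff + unigram logp
```

The same rule applies recursively for higher orders: a missing trigram backs off to the bigram, adding the bigram context's backoff weight.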
3
votes
1 answer

Error at ARPA model training with SRILM

I have followed this tutorial. After I run this code: ngram-count -kndiscount -interpolate -text train-text.txt -lm your.lm It gives me this error: "One of modified KneserNey discounts is negative error in discount estimator for order 2." How…
ziLk
  • 3,120
  • 21
  • 45
3
votes
1 answer

Input shape for Keras LSTM/GRU language model

I am trying to train a language model at the word level in Keras. I have my X and Y, both with the shape (90582L, 517L). When I try to fit this model: print('Build model...') model = Sequential() model.add(GRU(512, return_sequences=True,…
ishido
  • 4,065
  • 9
  • 32
  • 42
3
votes
0 answers

Testing accuracy always more than 99%

I am trying to implement a language model using LSTMs in theano/keras. My network runs fine and I also see that the training loss decreases, but the testing accuracy is always above 99% even if I do not train my network for long. I have used…
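A frequent cause of this symptom is computing accuracy over padded positions, which dominate each batch and are trivially predicted. A padding-aware accuracy can be sketched in pure Python (the function name and pad id are illustrative):

```python
def masked_accuracy(preds, targets, pad_id=0):
    """Token accuracy that skips padding positions entirely."""
    correct = total = 0
    for p, t in zip(preds, targets):
        if t == pad_id:
            continue                 # do not score padding
        total += 1
        correct += (p == t)
    return correct / total if total else 0.0

# Mostly padding: naive accuracy would be 8/10 here because the
# model "predicts" the pad token; masked accuracy is only 1/3.
preds   = [5, 2, 9, 0, 0, 0, 0, 0, 0, 0]
targets = [5, 3, 7, 0, 0, 0, 0, 0, 0, 0]
print(masked_accuracy(preds, targets))
```

If sequences are padded to length 517 but average only a few dozen real tokens, an unmasked metric will sit near 100% regardless of model quality.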
3
votes
2 answers

command line parameter in word2vec

I want to use word2vec to create my own word-vector corpus from the current version of the English Wikipedia, but I can't find an explanation of the command-line parameters for that program. In the demo script you can find the following: (text8 is…
Rainflow
  • 161
  • 1
  • 2
  • 5
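For reference, the original C implementation's demo script (demo-word.sh) invokes the binary roughly as below; the flags are the ones that script uses, with wiki.txt standing in as a placeholder for your own preprocessed Wikipedia dump:

```shell
# Annotated flags from the reference word2vec demo script:
#   -cbow 1       use CBOW (0 = skip-gram)
#   -size 200     dimensionality of the word vectors
#   -window 8     context window size
#   -negative 25  number of negative samples (-hs 0 disables
#                 hierarchical softmax)
#   -sample 1e-4  subsampling threshold for frequent words
#   -binary 1     write vectors in binary format
#   -iter 15      number of training passes over the corpus
./word2vec -train wiki.txt -output vectors.bin -cbow 1 -size 200 \
  -window 8 -negative 25 -hs 0 -sample 1e-4 -threads 20 \
  -binary 1 -iter 15
```

The input must be plain tokenized text, so a Wikipedia dump needs to be stripped of markup first.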