Questions tagged [language-model]

266 questions
4
votes
2 answers

Negative results using kenlm

I am new to language modeling and made a 3-gram language model using KenLM from a large text file (~7 GB). I made a binary file from my language model and call it in Python like this: import kenlm model = kenlm.LanguageModel(
Emad Helmi
  • 75
  • 5
4
votes
1 answer

unable to open Cube language model params for hindi Language in tesseract

Tesseract is unable to read the cube language model. tesseract 1.png output.txt -l hin After the above command executes, the following error occurs: Cube ERROR (CubeRecoContext::Load): unable to read cube language model params from…
Madhav Nikam
  • 153
  • 2
  • 9
3
votes
0 answers

How is scaled_dot_product_attention meant to be used with cached keys/values in causal LM?

I'm implementing a transformer and I have everything working, including attention using the new scaled_dot_product_attention from PyTorch 2.0. I'll only be doing causal attention, however, so it seems like it makes sense to use the is_causal=True…
turboderp
  • 31
  • 2
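A note on this question: with a KV cache, the new query tokens sit at the end of the key sequence, so a plain is_causal=True mask (which, as I understand PyTorch's semantics, aligns the causal triangle to the top-left) does not line up when q_len < kv_len. A minimal pure-Python sketch of the mask you would actually want (the helper name is hypothetical, not a PyTorch API):

```python
def causal_mask_with_cache(q_len, cache_len):
    """Boolean mask of shape (q_len, cache_len + q_len).

    Entry [i][j] is True when query position i (absolute position
    cache_len + i) may attend to key position j. With a KV cache the
    new queries come *after* the cached positions, so row i can see
    every cached key plus the first i + 1 new keys.
    """
    kv_len = cache_len + q_len
    return [[j <= cache_len + i for j in range(kv_len)]
            for i in range(q_len)]

# Single-token decode with 4 cached positions: the one query row may
# attend to all 5 keys, so no masking is needed at all in this case.
print(causal_mask_with_cache(1, 4))
```

For single-token decoding (q_len == 1) every entry comes out True, which is why masking can simply be skipped once the prompt has been processed.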
3
votes
1 answer

Finetuning a LM vs prompt-engineering an LLM

Is it possible to finetune a much smaller language model like RoBERTa on, say, a customer service dataset and get results as good as one might get by prompting GPT-4 with parts of the dataset? Can a fine-tuned RoBERTa model learn to follow…
3
votes
2 answers

About BertForMaskedLM

I have recently read about BERT and want to use BertForMaskedLM for the fill-mask task. I know the BERT architecture. Also, as far as I know, BertForMaskedLM is built from BERT with a language modeling head on top, but I have no idea about what…
3
votes
2 answers

Huggingface Transformer - GPT2 resume training from saved checkpoint

Resuming GPT2 finetuning, implemented from run_clm.py. Does Hugging Face's GPT2 have a parameter to resume training from a saved checkpoint, instead of training again from the beginning? Suppose the Python notebook crashes while training; the…
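For the question above: per the Transformers documentation, trainer.train(resume_from_checkpoint=True) picks up from the latest checkpoint in output_dir (you can also pass a checkpoint path). The general save-and-resume pattern behind it can be sketched in plain stdlib Python (the toy loop and JSON file format are illustrative, not the Trainer's actual mechanism):

```python
import json, os

CKPT = "ckpt.json"

def train(total_steps, save_every=2):
    """Toy training loop that checkpoints its state and can resume
    from the last saved step after a crash or restart."""
    state = {"step": 0}
    if os.path.exists(CKPT):               # resume instead of restarting
        with open(CKPT) as f:
            state = json.load(f)
    while state["step"] < total_steps:
        state["step"] += 1                 # stand-in for a real train step
        if state["step"] % save_every == 0:
            with open(CKPT, "w") as f:
                json.dump(state, f)
    return state["step"]

if os.path.exists(CKPT):
    os.remove(CKPT)                        # fresh run for the demo
print(train(5))   # runs steps 1-5, checkpointing at steps 2 and 4
print(train(8))   # resumes from the saved step 4, runs 5-8
```

The second call does not repeat steps 1-4; it reloads the checkpoint and continues, which is exactly the behaviour the question is after.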
3
votes
0 answers

HuggingFace Trainer Segmentation Fault

Hugging Face Trainer keeps giving a segmentation fault with this setup code. The dataset is around 600 MB, and the server has 2×32 GB Nvidia V100 GPUs. Can anyone help find the issue? from transformers import Trainer, TrainingArguments,…
3
votes
0 answers

What are some techniques to improve contextual accuracy of semantic search engine using BERT?

I am implementing a semantic search engine using BERT (using cosine distance). To a certain extent the method is able to find sentences in a high-level context. However, when it comes to the narrowed-down context of a sentence, it gives several…
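Since the question mentions cosine distance: the core ranking step can be sketched in plain Python, with toy 3-d vectors standing in for BERT sentence embeddings (all names and data here are illustrative). A common way to improve accuracy on narrow contexts is to re-rank the top hits with a cross-encoder, but the baseline looks like:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, corpus):
    """Return (sentence, score) pairs sorted most-similar first."""
    scored = [(sent, cosine_similarity(query_vec, vec))
              for sent, vec in corpus]
    return sorted(scored, key=lambda p: p[1], reverse=True)

corpus = [("refund policy",  [0.9, 0.1, 0.0]),
          ("shipping times", [0.1, 0.9, 0.2]),
          ("return an item", [0.8, 0.2, 0.1])]
query = [1.0, 0.0, 0.1]           # toy embedding of "how do I get a refund?"
print(rank(query, corpus)[0][0])  # most similar sentence
</antml>```

Cosine similarity ignores vector magnitude, which is why it is the usual choice for comparing sentence embeddings of different lengths and styles.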
3
votes
1 answer

fastai: ValueError: __len__() should return >= 0

While running the following program - https://rawgit.com/sizhky/eef1482e63387df8e9e045ac1e5a0ce8/raw/bdbebafaab21739a27f6bf32e83da1557919b44b/lm.html I'm unable to call learner.fit as it throws the above error. Specifically, I'm trying to train a…
Yesh
  • 976
  • 12
  • 15
3
votes
1 answer

Keras shape error when checking input

I am trying to train a simple MLP model that maps input questions (using a 300D word embedding) and image features extracted using a pretrained VGG16 model to a feature vector of fixed length. However, I can't figure out how to fix the error…
Vanessa.C
  • 45
  • 1
  • 5
3
votes
1 answer

Correct way to calculate probabilities using ARPA LM data

I am writing a small library for calculating n-gram probabilities. I have a LM described by an ARPA file (it's quite a simple format: probability ngram backoff_weight): ... -5.1090264 Hello -0.05108307 -5.1090264 Bob -0.05108307 -3.748848 we…
Bob
  • 5,809
  • 5
  • 36
  • 53
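For reference, ARPA files store log10 probabilities, and the standard backoff rule is: if the full n-gram is absent, add the context's backoff weight to the log probability of the shortened n-gram. A minimal sketch using the unigram values quoted in the question (the bigram entry is made up for illustration):

```python
# ARPA stores log10 probabilities plus a backoff weight per context.
# If the full n-gram is missing, add the context's backoff weight to
# the shorter n-gram's probability (all arithmetic in log10 space).
unigrams = {"Hello": (-5.1090264, -0.05108307),   # word -> (logp, backoff)
            "Bob":   (-5.1090264, -0.05108307)}
bigrams = {("Hello", "Bob"): -3.0}                # hypothetical entry

def bigram_logprob(w1, w2):
    if (w1, w2) in bigrams:                # bigram present: use it directly
        return bigrams[(w1, w2)]
    backoff = unigrams[w1][1]              # backoff weight of the context
    return backoff + unigrams[w2][0]       # back off to the unigram

print(bigram_logprob("Hello", "Bob"))      # seen bigram: -3.0
print(bigram_logprob("Bob", "Hello"))      # backed off: backoff + unigram logp
```

The same rule applies recursively for higher orders: a missing trigram backs off to the bigram, adding the bigram context's backoff weight.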
3
votes
1 answer

Error at ARPA model training with SRILM

I have followed this tutorial. After I run this code: ngram-count -kndiscount -interpolate -text train-text.txt -lm your.lm It gives me this error: "One of modified KneserNey discounts is negative error in discount estimator for order 2." How…
ziLk
  • 3,120
  • 21
  • 45
3
votes
1 answer

Input shape for Keras LSTM/GRU language model

I am trying to train a language model at the word level in Keras. I have my X and Y, both with the shape (90582L, 517L). When I try to fit this model: print('Build model...') model = Sequential() model.add(GRU(512, return_sequences=True,…
ishido
  • 4,065
  • 9
  • 32
  • 42
3
votes
0 answers

Testing accuracy always more than 99%

I am trying to implement a language model using LSTMs in theano/keras. My network runs fine and I also see that the training loss decreases, but the testing accuracy is always above 99% even if I do not train my network for long. I have used…
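A frequent cause of this symptom is computing accuracy over padded positions, which dominate each batch and are trivially predicted. A padding-aware accuracy can be sketched in pure Python (the function name and pad id are illustrative):

```python
def masked_accuracy(preds, targets, pad_id=0):
    """Token accuracy that skips padding positions entirely."""
    correct = total = 0
    for p, t in zip(preds, targets):
        if t == pad_id:
            continue                 # do not score padding
        total += 1
        correct += (p == t)
    return correct / total if total else 0.0

# Mostly padding: naive accuracy would be 8/10 here because the
# model "predicts" the pad token; masked accuracy is only 1/3.
preds   = [5, 2, 9, 0, 0, 0, 0, 0, 0, 0]
targets = [5, 3, 7, 0, 0, 0, 0, 0, 0, 0]
print(masked_accuracy(preds, targets))
```

If sequences are padded to length 517 but average only a few dozen real tokens, an unmasked metric will sit near 100% regardless of model quality.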
3
votes
2 answers

command line parameter in word2vec

I want to use word2vec to create my own word-vector corpus from the current version of the English Wikipedia, but I can't find an explanation of the command-line parameters for that program. In the demo script you can find the following: (text8 is…
Rainflow
  • 161
  • 1
  • 2
  • 5
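For reference, the original C implementation's demo script (demo-word.sh) invokes the binary roughly as below; the flags are the ones that script uses, with wiki.txt standing in as a placeholder for your own preprocessed Wikipedia dump:

```shell
# Annotated flags from the reference word2vec demo script:
#   -cbow 1       use CBOW (0 = skip-gram)
#   -size 200     dimensionality of the word vectors
#   -window 8     context window size
#   -negative 25  number of negative samples (-hs 0 disables
#                 hierarchical softmax)
#   -sample 1e-4  subsampling threshold for frequent words
#   -binary 1     write vectors in binary format
#   -iter 15      number of training passes over the corpus
./word2vec -train wiki.txt -output vectors.bin -cbow 1 -size 200 \
  -window 8 -negative 25 -hs 0 -sample 1e-4 -threads 20 \
  -binary 1 -iter 15
```

The input must be plain tokenized text, so a Wikipedia dump needs to be stripped of markup first.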