Questions tagged [language-model]
266 questions
0
votes
0 answers
Masked language modeling - masked token/embedding clarification
When Transformers are trained with masked language modeling (or masked image modeling), the input embeddings at masked positions are replaced with a MASK token/learnable mask embedding. I'm wondering how these mask embeddings work - how does the…

clueless
- 211
- 2
- 3
- 7
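A minimal sketch of how the learnable mask embedding in the question above is typically used, assuming a PyTorch-style setup (the shapes and the 15% masking rate are illustrative, not from the question):

import torch
import torch.nn as nn

embed_dim = 64
token_embeddings = torch.randn(2, 10, embed_dim)  # (batch, seq_len, dim) from the input embedding layer

# One learnable vector shared by every masked position; it is updated by
# backpropagation like any other parameter, because the loss at masked
# positions flows through it.
mask_embedding = nn.Parameter(torch.zeros(embed_dim))

# Randomly pick positions to mask (boolean (batch, seq_len) tensor).
mask = torch.rand(2, 10) < 0.15

inputs = torch.where(mask.unsqueeze(-1), mask_embedding.expand_as(token_embeddings), token_embeddings)
# `inputs` is what the Transformer sees; the model must reconstruct the
# original tokens/features at the masked positions from the surrounding context.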
0
votes
0 answers
Creating a language model from scratch in spaCy with a POS-tagged corpus and word embeddings
I am trying to build and train a new language in spaCy from scratch, but I am struggling with how to configure spaCy for the initial training. Some notes on current resources:
I already have word embeddings from a corpus of around 150 million…

jlrl
- 11
- 2
0
votes
0 answers
Endless loop in a text generation script
I am trying to make a simple text generator using the Bulgarian language but my code is stuck in an endless loop. Here is the code:
from tokenization import tokenize_bulgarian_text
from nltk import bigrams, trigrams
from collections import Counter,…

mark-de
- 1
- 1
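One common cause of such endless loops is a generation loop with no stopping condition. A hedged sketch of trigram generation for the question above with an explicit length cap and an end-of-sentence check (the tiny tokenized corpus and the padding symbols are illustrative assumptions):

import random
from collections import Counter, defaultdict
from nltk import trigrams

corpus = [["това", "е", "пример", "."], ["това", "е", "тест", "."]]  # assumed tokenized sentences

# Count possible continuations for every bigram context, with sentence padding.
model = defaultdict(Counter)
for sentence in corpus:
    for w1, w2, w3 in trigrams(sentence, pad_left=True, pad_right=True):
        model[(w1, w2)][w3] += 1

text = [None, None]                      # start from the left-padded context
for _ in range(50):                      # hard cap so generation always terminates
    candidates = model[tuple(text[-2:])]
    if not candidates:                   # unseen context: stop instead of spinning
        break
    word = random.choices(list(candidates), weights=list(candidates.values()))[0]
    if word is None:                     # right padding marks the end of a sentence
        break
    text.append(word)

print(" ".join(text[2:]))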
0
votes
0 answers
Is it possible to mask only part of an embedding during masked 'language' modeling?
I'm using Transformers to process time-series data. Each X second time window of data (from S sensors) is embedded into F features before being inputted to the Transformer. Each F/S span of the embedding corresponds to features from one sensor's…

clueless
- 211
- 2
- 3
- 7
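A hedged sketch of masking only the slice of each embedding that belongs to one sensor, rather than the whole vector (the shapes, the per-sensor learnable mask vectors, and the masking rate are assumptions for illustration):

import torch
import torch.nn as nn

batch, windows, sensors, feats = 2, 8, 4, 16     # F = sensors * feats per time window
x = torch.randn(batch, windows, sensors, feats)  # embeddings grouped by sensor span

# One learnable mask vector per sensor span (a single shared vector also works).
sensor_mask_embed = nn.Parameter(torch.zeros(sensors, feats))

# Mask each (window, sensor) span independently with some probability.
span_mask = torch.rand(batch, windows, sensors) < 0.15

x_masked = torch.where(span_mask.unsqueeze(-1), sensor_mask_embed, x)
x_masked = x_masked.reshape(batch, windows, sensors * feats)  # back to (batch, seq, F)
# A reconstruction loss would then be computed only over the masked spans.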
0
votes
0 answers
Fine-tuning BERT vs. BERT embeddings + spaCy for text classification
I am a bit confused about the difference between, and the advantage of, fine-tuning BERT or other LLMs for text classification instead of just using the BERT embeddings with a spaCy pipeline.
I believe that by using the spaCy pipeline, speed and flexibility (different…

kbmmoran
- 1
- 1
0
votes
0 answers
Using BERT to generate technical skills from a set of activities just outputs the input data
I'm trying to use jobspanbert to generate technical IT skills from a column of job_activities, which is textual data describing the activities the employee does at his job.
The model ran for 2 hours straight so you can imagine how excited I was to…

Moe_blg
- 71
- 4
0
votes
0 answers
Laplace smoothing - language model perplexity increases when increasing the N of the N-gram model
I'm training a language model using the NLTK library in Python.
To obtain a better result, I use the Laplace smoothing technique.
But when I increase the N of the N-gram model, my perplexity increases too, and I was expecting that the perplexity would…

Leticia
- 1
- 1
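For reference, this is roughly how an NLTK Laplace-smoothed model is trained and evaluated; with add-one smoothing and a larger N, most test n-grams are unseen, so a growing perplexity is not unexpected (the toy corpus and n=3 are illustrative):

from nltk.lm import Laplace
from nltk.lm.preprocessing import padded_everygram_pipeline, padded_everygrams

n = 3
train_sents = [["this", "is", "a", "test"], ["this", "is", "another", "test"]]
train_data, vocab = padded_everygram_pipeline(n, train_sents)  # padded n-grams + vocabulary

lm = Laplace(n)            # add-one smoothing
lm.fit(train_data, vocab)

# Perplexity over the padded everygrams of a held-out sentence.
test_sent = ["this", "is", "a", "test"]
print(lm.perplexity(padded_everygrams(n, test_sent)))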
0
votes
0 answers
Why do I need to add --discount_fallback?
I have a simple English file:
I'm Harry Potter
Harry Potter is young wizard
Hermione Granger is Harry friend
There are seven fantasy novels of Harry Potter
I'm running the following command:
lmplz -o 3 myTest.arpa
And getting…

user3668129
- 4,318
- 6
- 45
- 87
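For context (not a complete answer): lmplz estimates modified Kneser-Ney discounts from n-gram count-of-count statistics, and on a corpus of only a few sentences that estimation fails, which is why the tool asks for --discount_fallback. The usual choices are to train on a much larger corpus or to pass the flag, e.g. (corpus.txt is an assumed input file):

lmplz -o 3 --discount_fallback < corpus.txt > myTest.arpa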
0
votes
0 answers
Inferring a large language model on a GPU with not enough video RAM
I'm trying some experiments running downloaded language models on a desktop machine. Specifically so far Bloom 3B and 7B on a machine with 32GB RAM, a 2-core CPU and no GPU.
(Throughout this question, I will be talking only about inferring –…

rwallace
- 31,405
- 40
- 123
- 242
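A hedged sketch of the usual workaround when a checkpoint does not fit in video RAM: load it with a Hugging Face automatic device map so layers are split across the GPU, CPU RAM, and disk (requires the accelerate package; the model name and offload directory are illustrative):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-3b"   # illustrative; the same pattern applies to larger checkpoints
tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" keeps as many layers on the GPU as fit and offloads the
# rest to CPU RAM (and to disk via offload_folder if RAM also runs out).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
    offload_folder="offload",        # assumed scratch directory
)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))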
0
votes
0 answers
How to train a language model for my data
I have a dataset of IDs that are meaningful to me.
I want to use language models to generate IDs based on a few IDs that I give as a starting point.
Let's say my dataset is like a sequence of IDs in each line separated by whitespace, more…

Ali
- 96
- 6
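One hedged way to set this up: treat each ID as a single token with a word-level tokenizer and train a small causal LM from scratch over the sequences (the file name ids.txt, the special tokens, and the tiny GPT-2 configuration are illustrative assumptions):

from tokenizers import Tokenizer, models, pre_tokenizers, trainers
from transformers import GPT2Config, GPT2LMHeadModel, PreTrainedTokenizerFast

# Whitespace-separated IDs, one sequence per line, in an assumed file "ids.txt".
tok = Tokenizer(models.WordLevel(unk_token="[UNK]"))
tok.pre_tokenizer = pre_tokenizers.WhitespaceSplit()
tok.train(["ids.txt"], trainers.WordLevelTrainer(special_tokens=["[UNK]", "[PAD]"]))

hf_tokenizer = PreTrainedTokenizerFast(tokenizer_object=tok, unk_token="[UNK]", pad_token="[PAD]")

# A small GPT-2-style causal LM trained from scratch over the ID vocabulary.
config = GPT2Config(vocab_size=tok.get_vocab_size(), n_positions=256, n_embd=128, n_layer=4, n_head=4)
model = GPT2LMHeadModel(config)
# Training then uses the ordinary causal-LM objective, e.g. transformers.Trainer
# with DataCollatorForLanguageModeling(tokenizer=hf_tokenizer, mlm=False).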
0
votes
1 answer
I want to make an AI text classifier using the OpenAI API, based on GPT-2, but I cannot find the API documentation for GPT-2
I wanted to create an AI text classifier project for my college. I wanted to use the GPT-2 API for this, as it is more reliable at catching content generated by GPT-3.5, so how can I find the GPT-2 documentation? Also, any useful resources for the same are…

golusharma
- 11
- 2
0
votes
0 answers
OutOfMemoryError when I create model embeddings
Just started learning Hugging Face transformers. I am trying to create embeddings of a large amount of text, but I always run into OutOfMemoryErrors. I am not sure what I am doing wrong; I am new to Python and transformers. Here is my code…
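Without seeing the code, a frequent cause is embedding the whole corpus in a single forward pass while gradients are still being tracked. A hedged sketch of batching the texts under torch.no_grad() (the model name, batch size, and mean pooling are illustrative choices):

import torch
from transformers import AutoTokenizer, AutoModel

model_name = "sentence-transformers/all-MiniLM-L6-v2"    # illustrative small encoder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

texts = ["first document", "second document"] * 100      # placeholder corpus
batch_size = 32                                           # shrink this if memory is still tight
embeddings = []

with torch.no_grad():                                     # no gradient buffers during inference
    for i in range(0, len(texts), batch_size):
        enc = tokenizer(texts[i:i + batch_size], padding=True, truncation=True, return_tensors="pt")
        out = model(**enc).last_hidden_state              # (batch, seq_len, hidden)
        embeddings.append(out.mean(dim=1).cpu())          # simple mean pooling per text

embeddings = torch.cat(embeddings)                        # (num_texts, hidden)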
0
votes
0 answers
Supervised fine-tuning in a pre-trained language model
Supervised fine-tuning adds an extra output layer to the pre-trained model.
Does this extra layer alter the probability of words that are not related to the fine-tuning data?

Chen APD
- 1
0
votes
1 answer
How to use a language model for speech recognition
I am working with an end-to-end speech recognition system. I have a language model for a language with the .lm extension, as well as other inference and pronunciation models. I want to make predictions using those models; can anyone suggest how to do it in…
0
votes
0 answers
Training a seq2seq LM over multiple iterations in PyTorch; it seems like there is a lack of connection between the encoder and decoder
My seq2seq model seems to only learn to produce sequences of popular words like:
"i don't . i don't . i don't . i don't . i don't"
I think that might be due to a lack of actual data flow between encoder and decoder.
That happens whether I use…

Valentyn Danylchuk
- 457
- 4
- 11
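The symptom described above (a decoder that collapses to frequent phrases) often appears when the decoder is never actually conditioned on the encoder. A minimal sketch of the usual connection, where the encoder's final hidden state initializes the decoder (the GRU sizes and vocabulary are illustrative):

import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 64, 128

embedding = nn.Embedding(vocab_size, embed_dim)
encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
out_proj = nn.Linear(hidden_dim, vocab_size)

src = torch.randint(0, vocab_size, (8, 12))   # (batch, src_len)
tgt = torch.randint(0, vocab_size, (8, 10))   # (batch, tgt_len), teacher-forcing inputs

_, enc_hidden = encoder(embedding(src))       # enc_hidden: (1, batch, hidden_dim)

# The crucial link: the decoder starts from the encoder's final hidden state.
# With a fresh zero state instead, the decoder only ever models the target-side
# language and tends to collapse to high-frequency phrases.
dec_out, _ = decoder(embedding(tgt), enc_hidden)
logits = out_proj(dec_out)                    # (batch, tgt_len, vocab_size)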