Questions tagged [language-model]

266 questions
0
votes
0 answers

Masked language modeling - masked token/embedding clarification

When Transformers are trained with masked language modeling (or masked image modeling), the input embeddings at masked positions are replaced with a MASK token/learnable mask embedding. I'm wondering how these mask embeddings work - how does the…
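A minimal PyTorch sketch of the usual mechanism (names such as mask_token_emb are illustrative, not from any particular library): the mask embedding is just another learned parameter that is written into the input sequence at masked positions, and the encoder is trained to predict the original tokens there.

import torch
import torch.nn as nn

# Sketch: a learnable mask embedding is substituted at masked positions
# before the sequence enters the Transformer encoder.
vocab_size, d_model = 1000, 64
token_emb = nn.Embedding(vocab_size, d_model)
mask_token_emb = nn.Parameter(torch.randn(d_model))   # trained like any other weight

tokens = torch.randint(0, vocab_size, (2, 10))        # (batch, seq_len)
masked = torch.rand(2, 10) < 0.15                     # True where a position is masked

x = token_emb(tokens)                                 # (batch, seq_len, d_model)
x = torch.where(masked.unsqueeze(-1), mask_token_emb.expand_as(x), x)

encoder_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
hidden = encoder(x)   # an MLM head reads its predictions off the masked positions
print(hidden.shape)   # torch.Size([2, 10, 64])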
0
votes
0 answers

Creating a language model from scratch in spaCy with a POS-tagged corpus and word embeddings

I am trying to build and train a new language model in spaCy from scratch, but I am struggling with how to configure spaCy for the initial training. Some notes on current resources: I already have word embeddings from a corpus of around 150 million…
jlrl
  • 11
  • 2
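A minimal sketch of one way to get existing vectors into a fresh spaCy v3 pipeline from Python (the file name and vector format are assumptions; the actual training would still go through spacy train with a config):

import numpy as np
import spacy

# Sketch: pull pre-computed word vectors into a blank pipeline's vocab,
# then add the components to be trained on the POS-tagged corpus.
nlp = spacy.blank("xx")   # "xx" = spaCy's multi-language class; use your language code if spaCy has one

# vectors.txt assumed to hold one "word v1 v2 ... vn" entry per line
with open("vectors.txt", encoding="utf8") as f:
    for line in f:
        word, *values = line.rstrip().split(" ")
        nlp.vocab.set_vector(word, np.asarray(values, dtype="float32"))

nlp.add_pipe("tagger")       # trained later, e.g. via `spacy train` and a config
nlp.to_disk("base_model")    # a starting point the training config can reference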
0
votes
0 answers

Endless loop in a text generation script

I am trying to make a simple text generator using the Bulgarian language but my code is stuck in an endless loop. Here is the code: from tokenization import tokenize_bulgarian_text from nltk import bigrams, trigrams from collections import Counter,…
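A common cause of such loops is generating without any stopping condition. A minimal sketch of a trigram generator with a hard length cap and an end-of-corpus check (the corpus below is a placeholder, not the Bulgarian data from the question):

import random
from collections import Counter, defaultdict
from nltk import trigrams

# Sketch of a trigram generator with explicit stopping conditions,
# so generation cannot loop forever.
tokens = "this is a tiny example corpus this is only a tiny example".split()

model = defaultdict(Counter)
for w1, w2, w3 in trigrams(tokens, pad_left=True, pad_right=True):
    model[(w1, w2)][w3] += 1

text = [None, None]                 # matches the left padding symbols
for _ in range(50):                 # hard cap: never generate more than 50 words
    candidates = model.get((text[-2], text[-1]))
    if not candidates:              # dead end: this context was never seen
        break
    next_word = random.choices(list(candidates), weights=candidates.values())[0]
    if next_word is None:           # right padding marks the end of the corpus
        break
    text.append(next_word)

print(" ".join(w for w in text if w is not None))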
0
votes
0 answers

Is it possible to mask only part of an embedding during masked 'language' modeling?

I'm using Transformers to process time-series data. Each X second time window of data (from S sensors) is embedded into F features before being inputted to the Transformer. Each F/S span of the embedding corresponds to features from one sensor's…
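A minimal sketch of masking only the slice of the embedding that belongs to one sensor (all shapes and names are assumptions): the rest of the feature vector is left untouched and only the chosen span is overwritten with a learnable mask vector.

import torch
import torch.nn as nn

# Sketch: F features per window, split into S equal spans of F//S,
# one span per sensor. Only the chosen sensor's span gets masked.
batch, seq_len, S, F = 4, 20, 5, 40
span = F // S
mask_vec = nn.Parameter(torch.randn(span))       # learnable per-span mask embedding

x = torch.randn(batch, seq_len, F)               # embedded sensor windows
sensor_to_mask = 2                               # hide this sensor's features
positions = torch.rand(batch, seq_len) < 0.15    # which time steps to mask

sl = slice(sensor_to_mask * span, (sensor_to_mask + 1) * span)
x_masked = x.clone()
x_masked[..., sl] = torch.where(
    positions.unsqueeze(-1),                     # (batch, seq_len, 1)
    mask_vec,                                    # broadcasts over the span
    x[..., sl],
)
print(x_masked.shape)   # torch.Size([4, 20, 40]) -- same shape, partially masked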
0
votes
0 answers

Finetuning BERT vs BERT + spaCy for text classification

I am a bit confused about the difference between (and the advantage of) fine-tuning BERT or other LLMs for text classification instead of just using the BERT embeddings with a spaCy pipeline. I believe that by using the spaCy pipeline, speed and flexibility (different…
0
votes
0 answers

Using BERT to generate technical skills from a set of activities just outputs the input data

I'm trying to use jobspanbert to generate technical IT skills from a column of job_activities, which is textual data describing the activities the employee does at his job. The model ran for 2 hours straight so you can imagine how excited I was to…
0
votes
0 answers

Laplace Smoothing - Greater language model perplexity when increasing the N of an N-gram model

I'm training a language model using Python's NLTK library. To obtain a better result, I use the Laplace smoothing technique. But when I increase the N of the N-gram model, the perplexity increases too, and I was expecting that the perplexity would…
Leticia
  • 1
  • 1
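Rising perplexity with larger N under add-one smoothing is not unusual, since higher-order n-grams are sparser and Laplace smoothing spreads a lot of probability mass over unseen continuations. A minimal sketch for comparing orders with nltk.lm (the corpus is a placeholder):

from nltk.lm import Laplace
from nltk.lm.preprocessing import padded_everygram_pipeline, pad_both_ends
from nltk.util import ngrams

# Sketch: train Laplace-smoothed models of increasing order and compare
# perplexity on the same held-out sentence.
train_sents = [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["b", "a", "c", "d"]]
test_sent = ["a", "b", "c", "d"]

for n in (1, 2, 3):
    train, vocab = padded_everygram_pipeline(n, train_sents)
    lm = Laplace(n)
    lm.fit(train, vocab)
    # score n-grams of the same order the model was trained for
    test_ngrams = list(ngrams(pad_both_ends(test_sent, n=n), n))
    print(n, lm.perplexity(test_ngrams))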
0
votes
0 answers

Why do I need to add --discount_fallback?

I have a simple English file: I'm Harry Potter Harry Potter is young wizard Hermione Granger is Harry friend There are seven fantasy novels of Harry Potter I'm running the following command: lmplz -o 3 myTest.arpa And getting…
user3668129
  • 4,318
  • 6
  • 45
  • 87
0
votes
0 answers

Inferring a large language model on a GPU with not enough video RAM

I'm trying some experiments running downloaded language models on a desktop machine. Specifically so far Bloom 3B and 7B on a machine with 32GB RAM, a 2-core CPU and no GPU. (Throughout this question, I will be talking only about inferring –…
rwallace
  • 31,405
  • 40
  • 123
  • 242
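A minimal sketch of one common way to run a model that does not fit in VRAM, using Hugging Face Transformers with accelerate-style offloading (this assumes the accelerate package is installed; half precision plus device_map="auto" lets layers that don't fit on the GPU spill over to CPU RAM):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch: load BLOOM in half precision and let device_map="auto" place
# layers on the GPU first, offloading the remainder to CPU memory.
model_id = "bigscience/bloom-3b"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half the memory of float32 weights
    device_map="auto",           # GPU layers first, remainder offloaded to CPU
)

inputs = tok("The capital of France is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))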
0
votes
0 answers

How to train a language model for my data

I have a dataset of IDs that are meaningful to me. I want to use language models to generate IDs based on a few IDs that I give as a starting point. Let's say my dataset is like a sequence of IDs in each line separated by whitespace, more…
Ali
  • 96
  • 6
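One simple route, sketched below with NLTK (the IDs are placeholders): treat every ID as a token, fit an n-gram model over the whitespace-separated lines, and generate continuations from the seed IDs.

from nltk.lm import MLE
from nltk.lm.preprocessing import padded_everygram_pipeline

# Sketch: each line is one sequence of IDs; every ID is treated as a token.
lines = [
    "id12 id7 id33 id12 id98",
    "id7 id33 id12 id98 id5",
    "id33 id12 id98 id5 id7",
]
sequences = [line.split() for line in lines]

n = 3
train, vocab = padded_everygram_pipeline(n, sequences)
lm = MLE(n)
lm.fit(train, vocab)

seed = ["id7", "id33"]                              # the IDs given as a starting point
new_ids = lm.generate(5, text_seed=seed, random_seed=0)
print(seed + new_ids)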
0
votes
1 answer

I want to make an AI text classifier using the OpenAI API, based on GPT-2, but I cannot find the API documentation for GPT-2

I wanted to create an AI text classifier project for my college. I wanted to use a GPT-2 API for this, as it is more reliable at catching content generated by GPT-3.5. How can I find the GPT-2 documentation? Also, any useful resources for this are…
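GPT-2's weights are openly available on the Hugging Face Hub rather than through the OpenAI API, so one hedged sketch of a detector-style signal is to score text with GPT-2 and use its perplexity as a feature (machine-written text often scores lower; the model choice and any threshold are illustrative, not a definitive classifier):

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Sketch: compute a text's perplexity under GPT-2 loaded from the Hub.
tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    enc = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])  # loss = mean token negative log-likelihood
    return torch.exp(out.loss).item()

# Lower perplexity is weak evidence the text is model-generated;
# a real classifier would tune a threshold on labelled data.
print(perplexity("This is a short piece of text to score."))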
0
votes
0 answers

OutOfMemoryError when I create model embeddings

I just started learning Hugging Face Transformers. I am trying to create embeddings for a large amount of text but I always run into OutOfMemoryErrors. I am not sure what I am doing wrong. I am new to Python and Transformers. Here is my code…
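A minimal sketch of the usual fixes (the model name and batch size are assumptions): embed in small batches, wrap the forward pass in torch.no_grad(), and move each batch of embeddings back to the CPU so GPU memory is freed as you go.

import torch
from transformers import AutoModel, AutoTokenizer

# Sketch: batched embedding without gradient tracking.
device = "cuda" if torch.cuda.is_available() else "cpu"
name = "sentence-transformers/all-MiniLM-L6-v2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name).to(device).eval()

texts = ["first document", "second document", "third document"]  # placeholder corpus
embeddings = []
batch_size = 16

with torch.no_grad():                        # no gradients -> far less memory
    for i in range(0, len(texts), batch_size):
        batch = tok(texts[i:i + batch_size], padding=True, truncation=True,
                    return_tensors="pt").to(device)
        out = model(**batch)
        emb = out.last_hidden_state.mean(dim=1)   # simple mean pooling (ignores padding for brevity)
        embeddings.append(emb.cpu())              # frees GPU memory each batch

embeddings = torch.cat(embeddings)
print(embeddings.shape)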
0
votes
0 answers

Supervised fine-tuning of a pre-trained language model

Supervised fine-tuning adds an extra output layer to the pre-trained model. Does this extra layer alter the probability of words that are not related to the fine-tuning data?
0
votes
1 answer

How to use a language model for speech recognition

I am working with an end-to-end speech recognition system. I have a language model for a language with the .lm extension, and other inference and pronunciation models. I want to make predictions from those models; can anyone suggest how to do it in…
0
votes
0 answers

Training a seq2seq LM over multiple iterations in PyTorch; it seems like there is a lack of connection between encoder and decoder

My seq2seq model seems to only learn to produce sequences of popular words like: "i don't . i don't . i don't . i don't . i don't" I think that might be due to a lack of actual data flow between encoder and decoder. That happens whether I use…
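A minimal GRU-based sketch of the usual connection (all sizes are illustrative): the encoder's final hidden state is passed as the decoder's initial hidden state. If the decoder instead starts from zeros, it effectively becomes an unconditional language model and tends to repeat high-frequency phrases like the ones quoted above.

import torch
import torch.nn as nn

# Sketch: the decoder is conditioned on the source by receiving the
# encoder's final hidden state as its initial hidden state.
class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_size, emb_dim)
        self.tgt_emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt):
        _, h = self.encoder(self.src_emb(src))           # h: (1, batch, hidden_dim)
        dec_out, _ = self.decoder(self.tgt_emb(tgt), h)  # <- the crucial connection
        return self.out(dec_out)

model = Seq2Seq(vocab_size=100)
logits = model(torch.randint(0, 100, (2, 7)), torch.randint(0, 100, (2, 5)))
print(logits.shape)  # torch.Size([2, 5, 100])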