Questions tagged [language-model]
266 questions
0
votes
0 answers
Masked language modeling - masked token/embedding clarification
When Transformers are trained with masked language modeling (or masked image modeling), the input embeddings at masked positions are replaced with a MASK token/learnable mask embedding. I'm wondering how these mask embeddings work - how does the…

clueless
- 211
- 2
- 3
- 7
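A minimal sketch of how the learnable mask embedding in the question above is typically used, assuming a PyTorch-style setup (the shapes and the 15% masking rate are illustrative, not from the question):

import torch
import torch.nn as nn

embed_dim = 64
token_embeddings = torch.randn(2, 10, embed_dim)  # (batch, seq_len, dim) from the input embedding layer

# One learnable vector shared by every masked position; it is updated by
# backpropagation like any other parameter, because the loss at masked
# positions flows through it.
mask_embedding = nn.Parameter(torch.zeros(embed_dim))

# Randomly pick positions to mask (boolean (batch, seq_len) tensor).
mask = torch.rand(2, 10) < 0.15

inputs = torch.where(mask.unsqueeze(-1), mask_embedding.expand_as(token_embeddings), token_embeddings)
# `inputs` is what the Transformer sees; the model must reconstruct the
# original tokens/features at the masked positions from the surrounding context.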
0
votes
0 answers
Creating a language model from scratch in spaCy with a POS-tagged corpus and word embeddings
I am trying to build and train a new language in spaCy from scratch, but I am struggling with how to configure spaCy for the initial training. Some notes on current resources:
I already have word embeddings from a corpus of around 150 million…

jlrl
- 11
- 2
0
votes
0 answers
Endless loop in a text generation script
I am trying to make a simple text generator using the Bulgarian language but my code is stuck in an endless loop. Here is the code:
from tokenization import tokenize_bulgarian_text
from nltk import bigrams, trigrams
from collections import Counter,…

mark-de
- 1
- 1
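One common cause of such endless loops is a generation loop with no stopping condition. A hedged sketch of trigram generation for the question above with an explicit length cap and an end-of-sentence check (the tiny tokenized corpus and the padding symbols are illustrative assumptions):

import random
from collections import Counter, defaultdict
from nltk import trigrams

corpus = [["това", "е", "пример", "."], ["това", "е", "тест", "."]]  # assumed tokenized sentences

# Count possible continuations for every bigram context, with sentence padding.
model = defaultdict(Counter)
for sentence in corpus:
    for w1, w2, w3 in trigrams(sentence, pad_left=True, pad_right=True):
        model[(w1, w2)][w3] += 1

text = [None, None]                      # start from the left-padded context
for _ in range(50):                      # hard cap so generation always terminates
    candidates = model[tuple(text[-2:])]
    if not candidates:                   # unseen context: stop instead of spinning
        break
    word = random.choices(list(candidates), weights=list(candidates.values()))[0]
    if word is None:                     # right padding marks the end of a sentence
        break
    text.append(word)

print(" ".join(text[2:]))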
0
votes
0 answers
Is it possible to mask only part of an embedding during masked 'language' modeling?
I'm using Transformers to process time-series data. Each X second time window of data (from S sensors) is embedded into F features before being inputted to the Transformer. Each F/S span of the embedding corresponds to features from one sensor's…

clueless
- 211
- 2
- 3
- 7
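A hedged sketch of masking only the slice of each embedding that belongs to one sensor, rather than the whole vector (the shapes, the per-sensor learnable mask vectors, and the masking rate are assumptions for illustration):

import torch
import torch.nn as nn

batch, windows, sensors, feats = 2, 8, 4, 16     # F = sensors * feats per time window
x = torch.randn(batch, windows, sensors, feats)  # embeddings grouped by sensor span

# One learnable mask vector per sensor span (a single shared vector also works).
sensor_mask_embed = nn.Parameter(torch.zeros(sensors, feats))

# Mask each (window, sensor) span independently with some probability.
span_mask = torch.rand(batch, windows, sensors) < 0.15

x_masked = torch.where(span_mask.unsqueeze(-1), sensor_mask_embed, x)
x_masked = x_masked.reshape(batch, windows, sensors * feats)  # back to (batch, seq, F)
# A reconstruction loss would then be computed only over the masked spans.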
0
votes
0 answers
Fine-tuning BERT vs. BERT embeddings + spaCy for text classification
I am a bit confused about the difference between, and the advantage of, fine-tuning BERT or other LLMs for text classification instead of just using the BERT embeddings with a spaCy pipeline.
I believe that by using the spaCy pipeline, speed and flexibility (different…

kbmmoran
- 1
- 1
0
votes
0 answers
Using BERT to generate technical skills from a set of activities just outputs the input data
I'm trying to use jobspanbert to generate technical IT skills from a column of job_activities, which is textual data describing the activities the employee does at his job.
The model ran for 2 hours straight so you can imagine how excited I was to…

Moe_blg
- 71
- 4
0
votes
0 answers
Laplace smoothing - language model perplexity increases when increasing the N of the N-gram model
I'm training a language model using the NLTK library in Python.
To obtain a better result, I use the Laplace smoothing technique.
But when I increase the N of the N-gram model, my perplexity increases too, and I was expecting that the perplexity would…

Leticia
- 1
- 1
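For reference, this is roughly how an NLTK Laplace-smoothed model is trained and evaluated; with add-one smoothing and a larger N, most test n-grams are unseen, so a growing perplexity is not unexpected (the toy corpus and n=3 are illustrative):

from nltk.lm import Laplace
from nltk.lm.preprocessing import padded_everygram_pipeline, padded_everygrams

n = 3
train_sents = [["this", "is", "a", "test"], ["this", "is", "another", "test"]]
train_data, vocab = padded_everygram_pipeline(n, train_sents)  # padded n-grams + vocabulary

lm = Laplace(n)            # add-one smoothing
lm.fit(train_data, vocab)

# Perplexity over the padded everygrams of a held-out sentence.
test_sent = ["this", "is", "a", "test"]
print(lm.perplexity(padded_everygrams(n, test_sent)))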
0
votes
0 answers
Why do I need to add --discount_fallback?
I have a simple English file:
I'm Harry Potter
Harry Potter is young wizard
Hermione Granger is Harry friend
There are seven fantasy novels of Harry Potter
I'm running the following command:
lmplz -o 3 myTest.arpa
And getting…

user3668129
- 4,318
- 6
- 45
- 87
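For context (not a complete answer): lmplz estimates modified Kneser-Ney discounts from n-gram count-of-count statistics, and on a corpus of only a few sentences that estimation fails, which is why the tool asks for --discount_fallback. The usual choices are to train on a much larger corpus or to pass the flag, e.g. (corpus.txt is an assumed input file):

lmplz -o 3 --discount_fallback < corpus.txt > myTest.arpa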
0
votes
0 answers
Inferring a large language model on a GPU with not enough video RAM
I'm trying some experiments running downloaded language models on a desktop machine. Specifically so far Bloom 3B and 7B on a machine with 32GB RAM, a 2-core CPU and no GPU.
(Throughout this question, I will be talking only about inferring –…

rwallace
- 31,405
- 40
- 123
- 242
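A hedged sketch of the usual workaround when a checkpoint does not fit in video RAM: load it with a Hugging Face automatic device map so layers are split across the GPU, CPU RAM, and disk (requires the accelerate package; the model name and offload directory are illustrative):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-3b"   # illustrative; the same pattern applies to larger checkpoints
tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" keeps as many layers on the GPU as fit and offloads the
# rest to CPU RAM (and to disk via offload_folder if RAM also runs out).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
    offload_folder="offload",        # assumed scratch directory
)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))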
0
votes
0 answers
How to train a language model for my data
I have a dataset of IDs that are meaningful to me.
I want to use language models to generate IDs based on a few IDs that I give as a starting point.
Let's say my dataset is like a sequence of IDs in each line separated by whitespace, more…

Ali
- 96
- 6
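One hedged way to set this up: treat each ID as a single token with a word-level tokenizer and train a small causal LM from scratch over the sequences (the file name ids.txt, the special tokens, and the tiny GPT-2 configuration are illustrative assumptions):

from tokenizers import Tokenizer, models, pre_tokenizers, trainers
from transformers import GPT2Config, GPT2LMHeadModel, PreTrainedTokenizerFast

# Whitespace-separated IDs, one sequence per line, in an assumed file "ids.txt".
tok = Tokenizer(models.WordLevel(unk_token="[UNK]"))
tok.pre_tokenizer = pre_tokenizers.WhitespaceSplit()
tok.train(["ids.txt"], trainers.WordLevelTrainer(special_tokens=["[UNK]", "[PAD]"]))

hf_tokenizer = PreTrainedTokenizerFast(tokenizer_object=tok, unk_token="[UNK]", pad_token="[PAD]")

# A small GPT-2-style causal LM trained from scratch over the ID vocabulary.
config = GPT2Config(vocab_size=tok.get_vocab_size(), n_positions=256, n_embd=128, n_layer=4, n_head=4)
model = GPT2LMHeadModel(config)
# Training then uses the ordinary causal-LM objective, e.g. transformers.Trainer
# with DataCollatorForLanguageModeling(tokenizer=hf_tokenizer, mlm=False).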
0
votes
1 answer
I want to make an AI text classifier using the OpenAI API, based on GPT-2, but I cannot find the API documentation for GPT-2
I wanted to create an AI text classifier project for my college. I wanted to use the GPT-2 API for this, as it is more reliable at catching content generated by GPT-3.5, so how can I find the GPT-2 documentation? Also, any useful resources for the same are…

golusharma
- 11
- 2
0
votes
0 answers
OutOfMemoryError when I create model embeddings
Just started learning Hugging Face transformers. I am trying to create embeddings of a large amount of text, but I always run into OutOfMemoryErrors. I am not sure what I am doing wrong; I am new to Python and transformers. Here is my code…
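Without seeing the code, a frequent cause is embedding the whole corpus in a single forward pass while gradients are still being tracked. A hedged sketch of batching the texts under torch.no_grad() (the model name, batch size, and mean pooling are illustrative choices):

import torch
from transformers import AutoTokenizer, AutoModel

model_name = "sentence-transformers/all-MiniLM-L6-v2"    # illustrative small encoder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

texts = ["first document", "second document"] * 100      # placeholder corpus
batch_size = 32                                           # shrink this if memory is still tight
embeddings = []

with torch.no_grad():                                     # no gradient buffers during inference
    for i in range(0, len(texts), batch_size):
        enc = tokenizer(texts[i:i + batch_size], padding=True, truncation=True, return_tensors="pt")
        out = model(**enc).last_hidden_state              # (batch, seq_len, hidden)
        embeddings.append(out.mean(dim=1).cpu())          # simple mean pooling per text

embeddings = torch.cat(embeddings)                        # (num_texts, hidden)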
0
votes
0 answers
Supervised fine-tuning in a pre-trained language model
Supervised fine-tuning adds an extra output layer to the pre-trained model.
Does this extra layer alter the probability of words that are not related to the fine-tuning data?

Chen APD
- 1
0
votes
1 answer
How to use a language model for speech recognition
I am working with an end-to-end speech recognition system. I have a language model for a language with the .lm extension, as well as other inference and pronunciation models. I want to make predictions using those models; can anyone suggest how to do it in…
0
votes
0 answers
Training a seq2seq LM over multiple iterations in PyTorch; it seems like there is a lack of connection between the encoder and decoder
My seq2seq model seems to only learn to produce sequences of popular words like:
"i don't . i don't . i don't . i don't . i don't"
I think that might be due to a lack of actual data flow between encoder and decoder.
That happens whether I use…

Valentyn Danylchuk
- 457
- 4
- 11
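The symptom described above (a decoder that collapses to frequent phrases) often appears when the decoder is never actually conditioned on the encoder. A minimal sketch of the usual connection, where the encoder's final hidden state initializes the decoder (the GRU sizes and vocabulary are illustrative):

import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 64, 128

embedding = nn.Embedding(vocab_size, embed_dim)
encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
out_proj = nn.Linear(hidden_dim, vocab_size)

src = torch.randint(0, vocab_size, (8, 12))   # (batch, src_len)
tgt = torch.randint(0, vocab_size, (8, 10))   # (batch, tgt_len), teacher-forcing inputs

_, enc_hidden = encoder(embedding(src))       # enc_hidden: (1, batch, hidden_dim)

# The crucial link: the decoder starts from the encoder's final hidden state.
# With a fresh zero state instead, the decoder only ever models the target-side
# language and tends to collapse to high-frequency phrases.
dec_out, _ = decoder(embedding(tgt), enc_hidden)
logits = out_proj(dec_out)                    # (batch, tgt_len, vocab_size)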