Questions tagged [allennlp]

An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.

210 questions
1
vote
1 answer

How to interpret AllenNLP coreference resolution model output?

I am working on extracting people and tasks from texts (multiple sentences) and need a way to resolve coreference. I found this model, and it seems very promising, but once I installed the required libraries allennlp and allennlp_models and…
GSwart
  • 201
  • 2
  • 9
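
A minimal sketch of reading the model's output for the question above, assuming the public demo archive URL; the predictor returns a "document" token list and "clusters" of inclusive [start, end] token spans:

    from allennlp.predictors.predictor import Predictor

    predictor = Predictor.from_path(
        "https://storage.googleapis.com/allennlp-public-models/coref-spanbert-large-2021.03.10.tar.gz"
    )
    output = predictor.predict(document="Eva told Martin that she had finished the report.")

    tokens = output["document"]          # the tokenized input
    for cluster in output["clusters"]:   # each cluster is one entity
        # each mention is an inclusive [start, end] pair of token indices
        mentions = [" ".join(tokens[start:end + 1]) for start, end in cluster]
        print(mentions)                  # e.g. ['Eva', 'she']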
1
vote
1 answer

Google mT5-small configuration error because the number of attention heads is not a divisor of the model dimension

The configuration file for the HuggingFace google/mt5-small model (https://huggingface.co/google/mt5-small) defines { ... "d_model": 512, ... "num_heads": 6, ... } Link to the config file:…
MSLars
  • 13
  • 2
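
A quick check for the question above: T5-family models set the per-head size explicitly through d_kv, so d_model does not need to be divisible by num_heads; the attention inner dimension is num_heads * d_kv. A sketch with the HuggingFace config:

    from transformers import AutoConfig

    config = AutoConfig.from_pretrained("google/mt5-small")
    print(config.d_model, config.num_heads, config.d_kv)  # 512 6 64
    print(config.num_heads * config.d_kv)                 # 384, the attention projection size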
1
vote
0 answers

Predictor.from_path('coref-spanbert-large-2021.03.10.tar.gz') downloads model into cache though I provide a local copy of the model

I am trying to load a local copy of the coref-spanbert model using Predictor.from_path, but it starts downloading the model again into cache/huggingface. Can anyone help me fix this? >>> from allennlp.predictors import Predictor >>> coref_model =…
Irshad Bhat
  • 8,479
  • 1
  • 26
  • 36
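
For what it's worth, a sketch of loading from a local archive path; note that even with a local model.tar.gz, the config inside the archive can still name a HuggingFace transformer (here the SpanBERT weights), which AllenNLP resolves through the huggingface cache on first use, so a download at that point is expected rather than a re-download of the archive itself:

    from allennlp.predictors import Predictor

    # assumes the archive was saved in the current working directory
    coref_model = Predictor.from_path("./coref-spanbert-large-2021.03.10.tar.gz")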
1
vote
0 answers

Predictor.from_archive failed

archive = load_archive( "elmo-constituency-parser-2018.03.14.tar.gz" ) predictor = Predictor.from_archive(archive, 'constituency-parser') predictor.predict_json({"sentence": "This is a sentence to be predicted!"}) Loading the…
River Hope
  • 11
  • 1
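
A sketch of the same snippet with its imports spelled out; note this 2018 archive predates the 1.0 API split, so it needs an AllenNLP version contemporary with the archive, and the 'constituency-parser' predictor name is taken from the question:

    from allennlp.models.archival import load_archive
    from allennlp.predictors import Predictor

    archive = load_archive("elmo-constituency-parser-2018.03.14.tar.gz")  # local copy
    predictor = Predictor.from_archive(archive, "constituency-parser")
    print(predictor.predict_json({"sentence": "This is a sentence to be predicted!"}))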
1
vote
0 answers

Is there a way to change the tokenizer in AllenNLP's coreference resolution model?

Does anyone know how to change the tokenizer in AllenNLP's coreference resolution? By default, it uses spaCy, and I would like to use a whitespace tokenizer so as to tokenize only words, not punctuation. This is what I have tried so far but it does…
rosamariar
  • 36
  • 4
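
AllenNLP does ship a whitespace tokenizer; a minimal sketch of it in isolation, with the caveat that swapping it into the pretrained coreference predictor also means rebuilding the predictor's dataset reader around it, which is version-dependent:

    from allennlp.data.tokenizers import WhitespaceTokenizer

    tokenizer = WhitespaceTokenizer()
    tokens = tokenizer.tokenize("Barack Obama said he would sign the bill.")
    # splits on whitespace only, so punctuation stays attached: ..., 'bill.'
    print([t.text for t in tokens])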
1
vote
1 answer

Error when training AllenNLP adversarial bias mitigator using a pretrained masked language model

I'm attempting to create an adversarially debiased BERT masked language model using 'AdversarialBiasMitigator' alongside the AllenNLP pretrained MLM (from here:…
1
vote
1 answer

Conjunction issue in OPENIE 6

I am using OPENIE6 (https://github.com/dair-iitd/openie6) with the following input: President Trump met the leaders of India and China. But I am getting only one triplet: ARG1 = President trump V = met ARG2 = the leaders of India and…
1
vote
1 answer

AllenNLP BERT SRL input format ("OntoNotes v. 5.0 formatted")

The goal is to train BERT SRL on another data set. According to the configuration, it requires conll-formatted-ontonotes-5.0. Natively, my data comes in CoNLL format, and I converted it to the conll-formatted-ontonotes-5.0 format of the GitHub edition…
Chiarcos
  • 324
  • 1
  • 10
1
vote
0 answers

Is it possible to use the AllenNLP Semantic Role Labeler with BERT-large instead of BERT-base?

The BERT-based SRL model that Shi and Lin develop (which is currently the backend of the AllenNLP SRL model) has more consistent advantages over Ouchi et al.'s (2018) ensemble model when using BERT-large instead of BERT-base. For example, the…
Russell Richie
  • 421
  • 1
  • 5
  • 15
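
A hedged sketch for the question above: retraining the published SRL config with BERT-large should amount to overriding the transformer name. The "model.bert_model" key assumes the layout of the public bert_base_srl.jsonnet config, and the filenames are hypothetical; verify both against the config you actually train:

    from allennlp.commands.train import train_model_from_file

    train_model_from_file(
        "bert_base_srl.jsonnet",             # local copy of the SRL training config
        serialization_dir="srl-bert-large",
        overrides='{"model.bert_model": "bert-large-uncased"}',
    )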
1
vote
0 answers

Logging loss and metrics at the INFO level

I use AWS SageMaker to run training with AllenNLP. To track the loss and metrics, I need them printed at the INFO log level during training (or at least after each epoch). However, when I run the training, all loss and metric…
Richard
  • 514
  • 3
  • 9
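
A minimal sketch, assuming the training entry point can configure logging before training starts: route AllenNLP's logger to stdout at INFO level so SageMaker captures the per-epoch loss and metric lines:

    import logging
    import sys

    # send INFO-level records (including AllenNLP's epoch metrics) to stdout,
    # which SageMaker collects into the job's log stream
    logging.basicConfig(
        stream=sys.stdout,
        level=logging.INFO,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    logging.getLogger("allennlp").setLevel(logging.INFO)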
1
vote
0 answers

Disabling fast tokenization in AllenNLP models

Is it possible to disable fast tokenization in an AllenNLP model? I am trying to use the following model in my NLP pipeline but can't use fast tokenization, as it causes issues when…
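
A sketch under the assumption that the pipeline builds its tokenizer through AllenNLP's PretrainedTransformerTokenizer: its tokenizer_kwargs are forwarded to HuggingFace's AutoTokenizer.from_pretrained, where use_fast=False selects the slow Python tokenizer:

    from allennlp.data.tokenizers import PretrainedTransformerTokenizer

    # use_fast=False is a HuggingFace option, passed through untouched
    tokenizer = PretrainedTransformerTokenizer(
        "bert-base-uncased",                    # hypothetical model name
        tokenizer_kwargs={"use_fast": False},
    )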
1
vote
0 answers

No module named 'allennlp.data.tokenizers.word_splitter'

I'm using Python 3.7 in Google Colab. I installed allennlp==2.4.0 and allennlp-models. When I run my code, I get this error: from allennlp.data.tokenizers.word_splitter import SpacyWordSplitter ModuleNotFoundError: No module…
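
The word_splitter module was removed when the 1.0 API merged word splitters into tokenizers, so on allennlp 2.4.0 that import cannot work; SpacyTokenizer is the closest current equivalent. A sketch of the replacement:

    from allennlp.data.tokenizers import SpacyTokenizer

    tokenizer = SpacyTokenizer(language="en_core_web_sm")
    tokens = tokenizer.tokenize("AllenNLP merged word splitters into tokenizers.")
    print([t.text for t in tokens])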
1
vote
1 answer

How to continue training serialized AllenNLP model using `allennlp train`?

Currently training models using AllenNLP 1.2: allennlp train -f --include-package custom-exp /usr/training_config/mock_model_config.jsonnet -s test-mock-out The config is very standard: "dataset_reader" : { "reader": "params" }, …
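
A sketch of resuming in-process, assuming the config is left unchanged from the original run: recover=True is the programmatic counterpart of `allennlp train --recover` and picks up from the existing serialization directory (paths and the package name mirror the question):

    from allennlp.commands.train import train_model_from_file

    train_model_from_file(
        "/usr/training_config/mock_model_config.jsonnet",
        serialization_dir="test-mock-out",
        include_package="custom-exp",
        recover=True,   # resume from checkpoints in test-mock-out instead of overwriting
    )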
1
vote
1 answer

Get an item value from a nested dictionary inside the rows of a pandas DataFrame and get rid of the rest

I implemented AllenNLP's OIE, which extracts subject, predicate, and object information (in the form of ARG0, V, ARG1, etc.) embedded in nested strings. However, I need to make sure that each output is linked to the given ID of the original sentence. I…
blah
  • 674
  • 3
  • 17
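
A hedged sketch of the flattening step: the open-information-extraction predictor returns a dict with a "verbs" list, one entry per extraction, so one row per (sentence ID, verb) pair keeps the link to the source sentence. The `results` mapping from sentence IDs to predictions is hypothetical scaffolding:

    import pandas as pd

    # hypothetical {sentence_id: predictor output} mapping from the OIE run
    results = {
        42: {"verbs": [{"verb": "met",
                        "description": "[ARG0: Trump] [V: met] [ARG1: the leaders]"}]},
    }

    rows = [
        {"sentence_id": sid, "verb": v["verb"], "extraction": v["description"]}
        for sid, prediction in results.items()
        for v in prediction["verbs"]
    ]
    df = pd.DataFrame(rows)   # one row per extraction, ID preserved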
1
vote
1 answer

AllenNLP Multi-Task Model: Keep encoder weights for new heads

I have trained an AllenNLP multi-task model. I would like to keep the encoder/backbone weights and continue training with new heads on new datasets. How can I do that with AllenNLP? I have two basic ideas for how to do that: I followed this…
sinaj
  • 129
  • 1
  • 1
  • 10
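
One hedged way to do the first idea: filter the trained model's state dict down to the shared encoder and load it non-strictly into a freshly built model with the new heads. The "_backbone." prefix is an assumption; check old_model.state_dict().keys() for the prefix your MultiTaskModel actually uses:

    from allennlp.models.archival import load_archive

    def transfer_backbone(old_model, new_model, prefix="_backbone."):
        # keep only the shared encoder weights; strict=False leaves the
        # new heads randomly initialized
        backbone = {name: tensor
                    for name, tensor in old_model.state_dict().items()
                    if name.startswith(prefix)}
        new_model.load_state_dict(backbone, strict=False)

    old_model = load_archive("multitask-run/model.tar.gz").model  # hypothetical path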