
I am pretty new to AllenNLP and I am struggling to build a model that does not seem to fit perfectly into the standard way of building models in AllenNLP.

I want to build a pipeline model for NLP. The pipeline consists of two models, let's call them A and B. First A is trained; then, based on A's predictions over the full training set, B is trained.

What I have seen is that people define two separate models and train both using the command-line interface `allennlp train ...` in a shell script that looks like:

```shell
# set a bunch of environment variables
...
allennlp train -s $OUTPUT_BASE_PATH_A --include-package MyModel --force $CONFIG_MODEL_A

# prepare environment variables for model B
...
allennlp train -s $OUTPUT_BASE_PATH_B --include-package MyModel --force $CONFIG_MODEL_B
```

I have two concerns about that:

  1. This code is hard to debug.
  2. It's not very flexible. When I want to do a forward pass of the fully trained pipeline, I have to write yet another bash script to do that.

Any ideas on how to do that in a better way?

I thought about using a Python script instead of a shell script and invoking `allennlp.commands.main(..)` directly. That way I would at least have a single Python module I can run under a debugger.
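To make that idea concrete, here is a rough sketch of such a driver script. It calls `allennlp.commands.main()`, which parses `sys.argv`, so the script sets `sys.argv` to mimic the CLI invocation from the shell script above. The config paths and output directories are placeholders, not real files:

```python
# Sketch: drive both training runs from one Python module instead of a bash
# script, so the whole pipeline can be stepped through in a debugger.
import sys


def train_argv(config_path, serialization_dir):
    # Mirrors: allennlp train <config> -s <dir> --include-package MyModel --force
    return [
        "allennlp", "train", config_path,
        "-s", serialization_dir,
        "--include-package", "MyModel",
        "--force",
    ]


def run_pipeline():
    # Requires allennlp to be installed; imported lazily so the rest of the
    # module can be used without it.
    from allennlp.commands import main

    # Placeholder config/output paths -- substitute your own.
    for config, out_dir in [
        ("config_model_a.jsonnet", "output/model_a"),
        ("config_model_b.jsonnet", "output/model_b"),
    ]:
        sys.argv = train_argv(config, out_dir)
        main()
```

Between the two `main()` calls you could also run model A over the training data and write out whatever model B's dataset reader expects, all in the same Python process.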

1 Answer


There are two possibilities.

If you're really just plugging the output of one model into the input of another, you could merge them together into one model and run it that way. You can do this with two already-trained models by initializing the combined model with the two trained models using `from_file`. Doing it at training time is a little harder, but not impossible. You would train the first model like you do now. For the second step, you train the combined model directly, with the inner first model's weights frozen.
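A minimal sketch of that second step, assuming AllenNLP's usual torch-based `Model` classes. `freeze` is the only runnable piece here; the wrapper class in the comment is illustrative, and names like `PipelineModel`, `model_a`, and `a_predictions` are hypothetical, not AllenNLP API:

```python
def freeze(module):
    """Disable gradient updates for every parameter of a torch-like module.

    After this, the optimizer used to train the combined model will leave
    the child model's weights untouched.
    """
    for p in module.parameters():
        p.requires_grad = False
    return module


# Inside an AllenNLP Model subclass, the combined model would look roughly like:
#
# class PipelineModel(Model):
#     def __init__(self, vocab, model_a, model_b):
#         super().__init__(vocab)
#         self.model_a = freeze(model_a)   # already trained, kept fixed
#         self.model_b = model_b           # trained through this wrapper
#
#     def forward(self, **inputs):
#         a_out = self.model_a(**inputs)
#         return self.model_b(**inputs, a_predictions=a_out)
```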

The other thing you can do is use AllenNLP as a library, without the config files. We have a template up on GitHub that shows you how to do this. The basic insight is that everything you configure in one of the Jsonnet configuration files corresponds 1:1 to a Python class that you can use directly from Python. There is no requirement to use the configuration files. If you use AllenNLP this way, you have much more flexibility, including for chaining things together.
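The 1:1 correspondence can be illustrated with a toy registry. `MyModel` and the `"my_model"` key are hypothetical stand-ins for your own registered class; real AllenNLP does this lookup for you via its `Registrable`/`FromParams` machinery:

```python
# Toy illustration: a Jsonnet config entry maps directly to a constructor call.
import json


class MyModel:
    def __init__(self, hidden_size: int):
        self.hidden_size = hidden_size


# What AllenNLP's registry does, in miniature: map "type" names to classes.
REGISTRY = {"my_model": MyModel}


def build(params: dict):
    # Pop "type" to pick the class, pass the remaining keys as kwargs --
    # the same thing the config machinery does when it reads your Jsonnet.
    params = dict(params)
    cls = REGISTRY[params.pop("type")]
    return cls(**params)


config = json.loads('{"model": {"type": "my_model", "hidden_size": 64}}')
model = build(config["model"])

# Equivalent direct call -- no config file needed:
same_model = MyModel(hidden_size=64)
```

Once you call the constructors yourself, chaining model A into model B is just ordinary Python.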

Dirk Groeneveld
  • Thank you for pointing those two possibilities out. Is it possible to do a kind of hybrid of both? Load the models from the Jsonnet files separately and then chain them together? I am just asking because I am basing my project on an existing implementation where both models and their configs already exist. – user3411517 Jan 18 '21 at 10:07
  • You can definitely combine the two approaches, but it seems to me that you can get away with just approach #1. You just need to write a custom model class that takes in your two "child" models. – Dirk Groeneveld Jan 18 '21 at 19:47