I have a few questions about training and evaluating AllenNLP's coreference resolution model.
Are there any constraints/specifications on which GPUs should be used for training? I hit an out-of-memory (OOM) error midway through training on a Titan RTX with 24220 MiB of memory. Are there any parameters I can change that might help (note: I am using the BERT version rather than the SpanBERT version)? For context, the sketch below shows the kind of change I have in mind.
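I put this together from the published coref configs, so the parameter names (`max_span_width`, `max_sentences`, `spans_per_word`, `max_antecedents`) are my assumptions and may not all apply to this particular file; the serialization directory is just a placeholder:

```python
import json
from allennlp.commands.train import train_model_from_file
import allennlp_models.coref  # registers the coref reader and model

# Assumed parameter names based on the published coref configs; the batch
# size may instead live under data_loader.batch_sampler.batch_size,
# depending on how this config defines its data loader.
overrides = json.dumps({
    "dataset_reader.max_span_width": 20,  # consider shorter candidate spans
    "dataset_reader.max_sentences": 50,   # truncate very long documents
    "model.spans_per_word": 0.3,          # keep fewer spans after pruning
    "model.max_antecedents": 30,          # score fewer antecedents per span
})

train_model_from_file(
    "coref_bert-lstm.jsonnet",  # the config I am training with
    "output/coref-oom-test",    # placeholder serialization directory
    overrides=overrides,
)
```

Would reducing any of these be expected to bring peak memory under 24 GiB, or is a larger GPU simply required?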
I noticed that the model usage examples all load an already trained and published model. Can we instead point to the path of a model we have trained ourselves?
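In other words, something along these lines, where the archive path is a placeholder for the `model.tar.gz` that training writes into the serialization directory:

```python
from allennlp.predictors.predictor import Predictor
import allennlp_models.coref  # registers the coref model and predictor

# Placeholder path: the model.tar.gz produced by `allennlp train ... -s <dir>`
predictor = Predictor.from_path("/path/to/serialization_dir/model.tar.gz")
result = predictor.predict(
    document="Paul Allen founded Microsoft with Bill Gates. He grew up in Seattle."
)
print(result["clusters"])
```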
Can we substitute `roberta-base` with `bert-base-uncased` in the `coref_bert-lstm.jsonnet` file, or are other modifications necessary to make this change?
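Concretely, would an override like the following be enough, or does the rest of the config also need to change? The dotted keys are my guess at where the transformer name appears once the jsonnet is expanded, assuming the config uses the `pretrained_transformer_mismatched` indexer/embedder like the published coref configs:

```python
import json
from allennlp.commands.train import train_model_from_file
import allennlp_models.coref  # registers the coref reader and model

# Assumed config keys: both the token indexer and the embedder take a
# model_name in the published coref configs.
overrides = json.dumps({
    "dataset_reader.token_indexers.tokens.model_name": "bert-base-uncased",
    "model.text_field_embedder.token_embedders.tokens.model_name": "bert-base-uncased",
})

train_model_from_file(
    "coref_bert-lstm.jsonnet",
    "output/coref-bert",  # placeholder serialization directory
    overrides=overrides,
)
```

Since both `roberta-base` and `bert-base-uncased` produce 768-dimensional hidden states, I would not expect the downstream LSTM or feedforward dimensions to need changes, but I may be missing something.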