I have a custom dataset of conversational data specific to the farming domain. The spacy-experimental coreference model (en-coreference-web-trf) performs reasonably well at coreference resolution, but it is not accurate enough for my needs, so I need to fine-tune it further on my domain-specific data.
I found this repo, which lets you train a spaCy coref model, but I am having trouble following the instructions provided. I have my own custom data, but I do not know what format it needs to be in for training. The repo says the project trains the model on the OntoNotes dataset, so I thought I could convert my dataset into the OntoNotes format, but OntoNotes is not a public dataset, so I do not know its file structure. Also, does spaCy allow fine-tuning an existing model rather than training from scratch? I couldn't find any resources for this task apart from that repo.
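To make the format question concrete, here is a minimal sketch of what I assume the training data might look like if I skip OntoNotes and build spaCy `Doc` objects directly. It rests on an assumption I have not confirmed: that the coref component reads each cluster from `doc.spans` under keys of the form `coref_clusters_1`, `coref_clusters_2`, and so on. The helper names, the prefix constant, and the example sentence are made up for illustration.

```python
from typing import Dict, List, Tuple

# One mention = (start_token, end_token), end exclusive.
Mention = Tuple[int, int]

# ASSUMPTION: the coref component looks for span groups named
# "<prefix>_<n>" in doc.spans; "coref_clusters" is my guess at the prefix.
CLUSTER_PREFIX = "coref_clusters"

# Hypothetical example from my domain, already tokenized.
EXAMPLE_WORDS = ["The", "farmer", "sold", "his", "tractor",
                 "because", "it", "was", "old", "."]
# Cluster 1: "The farmer" / "his"; cluster 2: "his tractor" / "it".
EXAMPLE_CLUSTERS: List[List[Mention]] = [
    [(0, 2), (3, 4)],
    [(3, 5), (6, 7)],
]


def clusters_to_span_keys(clusters: List[List[Mention]]) -> Dict[str, List[Mention]]:
    """Give each cluster a numbered span-group key, starting at 1."""
    return {
        f"{CLUSTER_PREFIX}_{i}": list(mentions)
        for i, mentions in enumerate(clusters, start=1)
    }


def build_doc(nlp, words: List[str], clusters: List[List[Mention]]):
    """Create a Doc from pre-tokenized words and attach one span group per cluster."""
    from spacy.tokens import Doc  # imported lazily so the helper above needs no spaCy

    doc = Doc(nlp.vocab, words=words)
    for key, mentions in clusters_to_span_keys(clusters).items():
        doc.spans[key] = [doc[start:end] for start, end in mentions]
    return doc


def save_docs(docs, path: str) -> None:
    """Serialize annotated docs to a .spacy file, the format `spacy train` reads."""
    from spacy.tokens import DocBin

    DocBin(docs=docs).to_disk(path)
```

With spaCy installed, I would expect to use it roughly as `nlp = spacy.blank("en")`, then `save_docs([build_doc(nlp, EXAMPLE_WORDS, EXAMPLE_CLUSTERS)], "train.spacy")`. Whether the coref trainer accepts span groups in exactly this shape is the part I am unsure about.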
Please do help me with this issue. Thank you.