I needed to train (fine-tune) an NER token classifier to recognize our custom entities. The easiest way I found to do that was the tutorial: Token Classification with W-NUT Emerging Entities
But now I've run into a problem. The plan was to follow HuggingFace in Spark NLP - BERT Sentence.ipynb, but when I try:
model.save_pretrained(<path on DBFS>)
I get a file write error. As far as I understand, this happens because transformers/Keras can't write directly to distributed file systems like DBFS.
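To make it concrete, here is roughly what I'm doing (a minimal sketch; the model name, label count, and save path below are placeholders, not my real setup):

from transformers import TFAutoModelForTokenClassification

# placeholder base model and label count (W-NUT has 13 labels: 6 entity types as B-/I- plus O)
model = TFAutoModelForTokenClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=13
)

# ... fine-tuning on my custom training file happens here ...

# this is the call that raises the file write error for me
# (placeholder path; mine points at a DBFS location)
model.save_pretrained("/dbfs/tmp/my_ner_model")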
Is there any workaround for this?
I cannot move training off Databricks because I'm using data (entities) from the database to create the training file.
PS. Maybe I can do the same using only Spark NLP? If so, how, preferably using the same "tag only" format?
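By "tag only" format I mean the two-column token/tag layout used in the W-NUT tutorial, one token per line with sentences separated by blank lines, roughly like this (made-up example sentence):

Empire	B-location
State	I-location
Building	I-location
is	O
in	O
New	B-location
York	I-location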