I am trying to fine-tune a Hugging Face Donut (Document Understanding Transformer) model, but I am getting stuck when trying to create a DonutDataset object. I have the following code (running in Google Colab):
!pip install transformers datasets sentencepiece donut-python
from google.colab import drive
from donut.util import DonutDataset
from transformers import DonutProcessor, VisionEncoderDecoderModel, VisionEncoderDecoderConfig
drive.mount('/content/drive/')
projectdir = 'drive/MyDrive/donut'
donut_version = 'naver-clova-ix/donut-base-finetuned-cord-v2' # 'naver-clova-ix/donut-base'
config = VisionEncoderDecoderConfig.from_pretrained(donut_version)
config.decoder.max_length = 768
processor = DonutProcessor.from_pretrained(donut_version)
model = VisionEncoderDecoderModel.from_pretrained(donut_version, config=config)
train_dataset = DonutDataset(f'{projectdir}/input_doc_images',
                             model,
                             #'naver-clova-ix/donut-base-finetuned-cord-v2',
                             max_length=config.decoder.max_length,
                             split="train",
                             task_start_token="",
                             prompt_end_token="",
                             sort_json_key=True,
                             )
However, the DonutDataset(...) call throws the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-9d831be996e6> in <cell line: 4>()
2
3 max_length = 768
----> 4 train_dataset = DonutDataset(f'{projectdir}/input_doc_images',
5 model,
6 #'naver-clova-ix/donut-base-finetuned-cord-v2',
2 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
1612 if name in modules:
1613 return modules[name]
-> 1614 raise AttributeError("'{}' object has no attribute '{}'".format(
1615 type(self).__name__, name))
1616
AttributeError: 'VisionEncoderDecoderModel' object has no attribute 'json2token'
I'm a little confused because my model object is loaded from 'naver-clova-ix/donut-base-finetuned-cord-v2', and this line from model.py in the Donut GitHub repo seems to suggest that such a model does in fact have a json2token method.
What am I missing?
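One way to narrow this down, sketched under assumptions: the traceback shows that the object passed as the second argument to DonutDataset is asked for json2token, so a quick attribute check shows whether a given model object provides it. As far as I can tell, json2token in the Donut repo is defined on the repo's own DonutModel class in model.py, which is a different Python class from transformers' VisionEncoderDecoderModel even when both are loaded from the same checkpoint name. The two classes below are hypothetical stand-ins that only mimic that difference, and `missing_dataset_hooks` is an illustrative helper name, not part of either library.

```python
class StandInVisionEncoderDecoderModel:
    """Hypothetical stand-in for transformers.VisionEncoderDecoderModel.

    Assumption: like the real class in the traceback, it has no json2token.
    """


class StandInDonutModel:
    """Hypothetical stand-in for the Donut repo's DonutModel.

    Assumption: the real class defines json2token; this body is a dummy.
    """

    def json2token(self, obj):
        # Dummy implementation, only here so the attribute exists.
        return str(obj)


def missing_dataset_hooks(model, required=("json2token",)):
    """Return the names in `required` that `model` does not provide as callables."""
    return [
        name for name in required
        if not callable(getattr(model, name, None))
    ]


if __name__ == "__main__":
    for candidate in (StandInVisionEncoderDecoderModel(), StandInDonutModel()):
        print(type(candidate).__name__, missing_dataset_hooks(candidate))
```

Running the same check on the real `model` object from the code above, for example `missing_dataset_hooks(model)`, should list json2token if the object is of the class named in the traceback.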
By the way, you can view or copy my underlying data (images and JSON Lines metadata file) from my Google Drive 'donut' folder here: https://drive.google.com/drive/folders/1Gsr7d7Exvtx5PqjZQv2nXP9-pPDUEIOx?usp=sharing