I currently have reproducibility issues, although I set the seeds. I know that the model is initialized the same way (checked by inspecting the output of model.save("initial.h5") with h5dump and meld).
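For context, by "setting the seeds" I mean something along these lines (only a sketch, assuming TensorFlow 2.x; the value 42 is just a placeholder):

import os
import random
import numpy as np
import tensorflow as tf

# Set all relevant seeds before the model is built; 42 is an arbitrary placeholder.
os.environ["PYTHONHASHSEED"] = "42"
random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)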
The next thing I want to check is whether the training samples are used in the same order, so I would like to log them.
I train via
model.fit(dataset['train']['X'],
          dataset['train']['y'],
          epochs=cfg['model']['nb_epochs'],
          batch_size=cfg['model']['batch_size'],
          validation_split=cfg['model']['validation_split'],
          callbacks=[checkpoint],
          class_weight=cw)
I could also pass dataset['train']['id']. I would like to get a txt file containing the list of IDs in the order they are used: e.g. for a batch size of 32, a training set of 765 samples and 5 epochs I would expect 765 * 5 = 3825 lines in the file, where each ID appears roughly 5 times and the first 32 lines are the IDs of the first batch.
Is that possible?
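To make it concrete, something along the following lines is what I imagine. This is only a rough sketch assuming tf.keras (TensorFlow 2.x); IdLoggingSequence and "train_ids.txt" are made-up names:

import numpy as np
from tensorflow import keras

class IdLoggingSequence(keras.utils.Sequence):
    """Serves (X, y) batches and appends the IDs of each served batch to a text file."""
    def __init__(self, X, y, ids, batch_size, log_path):
        super().__init__()
        self.X, self.y, self.ids = X, y, ids
        self.batch_size = batch_size
        self.log_path = log_path

    def __len__(self):
        # Number of batches per epoch (the last batch may be smaller).
        return int(np.ceil(len(self.X) / self.batch_size))

    def __getitem__(self, idx):
        sl = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        # Log the IDs of this batch, one per line.
        with open(self.log_path, "a") as f:
            for sample_id in self.ids[sl]:
                f.write(f"{sample_id}\n")
        return self.X[sl], self.y[sl]

seq = IdLoggingSequence(dataset['train']['X'],
                        dataset['train']['y'],
                        dataset['train']['id'],
                        cfg['model']['batch_size'],
                        "train_ids.txt")
model.fit(seq,
          epochs=cfg['model']['nb_epochs'],
          callbacks=[checkpoint],
          class_weight=cw)

One caveat: validation_split is not supported when fitting on a Sequence, so the validation set would have to be split off beforehand and passed via validation_data, and with multiple workers the logged order may not exactly match the order in which batches are consumed.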