Printing training progress with Keras using QSUB and a bash file

Question

I can run a Python script that trains a Keras/TensorFlow model with the following bash script:

#!/bin/bash
#PBS -N Tarea_UNET
#PBS -l nodes=1:ppn=4:gpus=1
cd $PBS_O_WORKDIR
source $ANACONDA3/activate inictel_uni
python U-NET.py

Inside "U-NET.py" the training function goes like this:

history=model.fit(train_B,train_A, epochs = 200, batch_size = 20, validation_split=0.052631578, shuffle=True)

The problem is that I can't see the training progress, so I can't monitor the metrics or the estimated training time; I have to wait until the whole process finishes. "qstat" only reports how long the job has been running, so it doesn't help. Do you have any ideas?

score 1 · Accepted Answer · answered Apr 14 '18 at 00:04

One simple approach is to give Keras a callback, which it invokes at fixed points during training (for example, at the end of each batch or epoch). Inside the callback you can do whatever logging or progress reporting you want.

Here is the high-level documentation and some pre-made callbacks: https://keras.io/callbacks/

Usage is simple: pass a list of callbacks to fit through the callbacks argument.

model.fit(x_train, y_train, ..., callbacks=[<your_callbacks>])

See examples at the end of the doc.

You can see all the methods that you can override here: https://github.com/keras-team/keras/blob/adc321b4d7a4e22f6bdb00b404dfe5e23d4887aa/keras/callbacks.py#L146
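As a rough illustration of the idea, here is a minimal sketch of a custom callback that appends one line per epoch to a file, which you could then follow from the login node with something like "tail -f progress.log". It assumes the Keras bundled with TensorFlow (tensorflow.keras); with standalone Keras the base class would be keras.callbacks.Callback instead. The names format_progress, make_file_logger and progress.log are made up for this example, and the ETA is only a simple average over the epochs completed so far.

```python
import time


def format_progress(epoch, total_epochs, logs, elapsed):
    """Build one progress line: epoch counter, metrics, and a rough ETA.

    `epoch` is zero-based, as Keras passes it to on_epoch_end.
    """
    done = epoch + 1
    # Naive estimate: average time per finished epoch times epochs remaining.
    eta = elapsed / done * (total_epochs - done)
    metrics = " ".join(f"{k}={v:.4f}" for k, v in sorted(logs.items()))
    return f"epoch {done}/{total_epochs} {metrics} elapsed={elapsed:.0f}s eta={eta:.0f}s"


def make_file_logger(path, total_epochs):
    """Return a Keras callback that appends one line per epoch to `path`.

    Hypothetical helper for this example, not part of Keras.
    """
    # Assumption: TensorFlow's Keras. Imported here so that
    # format_progress can be tried without TensorFlow installed.
    from tensorflow import keras

    class FileLogger(keras.callbacks.Callback):
        def on_train_begin(self, logs=None):
            self.start = time.time()

        def on_epoch_end(self, epoch, logs=None):
            elapsed = time.time() - self.start
            line = format_progress(epoch, total_epochs, logs or {}, elapsed)
            # Open and close the file each time so the line is written
            # to disk right away instead of sitting in a buffer.
            with open(path, "a") as f:
                f.write(line + "\n")

    return FileLogger()


# Usage inside U-NET.py (other fit arguments as in the question):
# history = model.fit(train_B, train_A, epochs=200, batch_size=20,
#                     callbacks=[make_file_logger("progress.log", 200)])
```

If you only need the metrics and not the ETA, the pre-made keras.callbacks.CSVLogger does something similar without any custom code.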

I will use this. Thanks – Giorgio Luigi Morales Luna, Apr 15 '18 at 01:18
