Evaluating a tensorflow model on different levels of augmented data

Question

I am trying to build a noise-robust audio classification model. To evaluate my results I would like to run a loop to use model.evaluate() on a tf.dataset object that consists of my validation data augmented with different levels of noise. Here is my code so far:

# get noise data to mix with validation data for model evaluation:
en_data_val = tf.data.Dataset.from_tensor_slices(noise_files[:len(valid_files)])
# get samples from filenames:
en_data_val = en_data_val.map(parse_en, num_parallel_calls=AUTOTUNE)
# merg noise and sound valdiation sound data sets:
validation_data_en = tf.data.Dataset.zip((en_data_val, validation_data))    
# cache this state: 
validation_data_en = validation_data_en.cache()
    
for noise_level in [0, 0.01, 0.05, 0.1, 0.2, 0.4, 0.8, 1]:
    # Mix the sounds:
    validation_data_en_eval = validation_data_en.map(mix_with_noise_val, num_parallel_calls=AUTOTUNE)
    # Apply filter:
    validation_data_en_eval = validation_data_en_eval.map(preprocess, num_parallel_calls=AUTOTUNE)
    # Convert audio to spectrogramm:
    validation_data_en_eval = validation_data_en_eval.map(parse_to_spec, num_parallel_calls=AUTOTUNE)
    # now create batches:
    validation_data_en_eval = validation_data_en_eval.batch(256)
    # allow to prefetch a second batch while the first one ist processed:
    validation_data_en_eval = validation_data_en_eval.prefetch(AUTOTUNE)
    
    # Evaluate modell:
    eval_loss, eval_cat_acc, eval_precision, eval_recall = model.evaluate(validation_data_en_eval, verbose=1)
    
    # Save results to .txt:
    save_eval_results(noise_level, eval_loss, eval_cat_acc, eval_precision, eval_recall)

Here validation_data is the tf.dataset without noise that I previously used to train my model.

Running this gives me the same result for every loop run, which tells me that tf is not rebuilding the dataset object for every loop run the way I intended.

How to fix this?

score 0 · Answer 1 · answered Oct 07 '22 at 11:24

Found my mistake. Passing the noise_level explicitly to the mix_with_noise_val() function instead of just changing the global variable in the for-loop fixed my problem.

Doing so was a little tricky but I found the solution given by Faylixe here using partial from functools worked.

from functools import partial
for noise_level in [0, 0.01, 0.05, 0.1, 0.2, 0.4, 0.8, 1]:
    nl = tf.constant(noise_level, dtype=tf.float32)
    # Mix the sounds:
    validation_data_en_eval = validation_data_en.map(partial(mix_with_noise_val, nl), num_parallel_calls=AUTOTUNE)
...

May this help someone having the same problem.

Evaluating a tensorflow model on different levels of augmented data

1 Answers1