I've implemented what I believe to be the same model training loop in both TensorFlow's Neural Structured Learning (`nsl`) and the `cleverhans` library, and curiously, models adversarially trained with the two libraries (via `nsl.AdversarialRegularization` and `cleverhans.attacks.FastGradientMethod`, respectively) do not achieve comparable performance. However, this question isn't about those specific results, so I don't attempt to reproduce them here.
I'm curious more generally about what the implementation differences are for adversarial perturbation in `nsl.AdversarialRegularization.perturb_on_batch()` versus the `cleverhans` implementation of the same/similar functionality, which would be `FastGradientMethod.generate()`.
The `nsl` docs aren't especially clear, but they seem to imply that `nsl` uses the Fast Gradient Sign Method of Goodfellow et al. (2014), which is supposedly the same method implemented in `FastGradientMethod`. For example, `nsl` refers to the Goodfellow et al. paper in its adversarial training tutorial and in some of the function docs. Both libraries allow specification of similar parameters, e.g. an `epsilon` to control the magnitude of the perturbation and a choice of norm used to constrain it. However, the differences in adversarially-trained performance lead me to believe that the two libraries are not using the same underlying implementation. The `nsl` source is difficult to parse, so I am particularly curious what might be happening under the hood there.
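
For concreteness, here is roughly how I'm invoking the two functions with matched settings. This is a minimal sketch rather than my full training code: `model`, `x_batch`, `y_batch`, and the `'feature'` key are placeholders, and the cleverhans half assumes the v3.x TF1-style session API (I use `generate_np()`, the numpy convenience wrapper around `generate()`), so in practice the two halves may need to run under different TF execution modes:

```python
import numpy as np
import tensorflow as tf
import neural_structured_learning as nsl
from cleverhans.attacks import FastGradientMethod
from cleverhans.utils_keras import KerasModelWrapper

# Placeholders for this sketch: `model` is a compiled tf.keras classifier,
# and (x_batch, y_batch) is a single numpy batch.
eps = 0.05  # perturbation magnitude, matched across both libraries

# --- nsl: wrap the model and call perturb_on_batch() ---
adv_config = nsl.configs.make_adv_reg_config(
    adv_step_size=eps, adv_grad_norm='infinity')
adv_model = nsl.keras.AdversarialRegularization(
    model, label_keys=['label'], adv_config=adv_config)
adv_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
# 'feature' must match the name of the base model's input layer.
perturbed = adv_model.perturb_on_batch({'feature': x_batch, 'label': y_batch})
nsl_perturbed = perturbed['feature']

# --- cleverhans v3.x (TF1-style session API): FastGradientMethod ---
sess = tf.compat.v1.keras.backend.get_session()
fgsm = FastGradientMethod(KerasModelWrapper(model), sess=sess)
ch_perturbed = fgsm.generate_np(x_batch, eps=eps, ord=np.inf)
```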
What are the differences between the implementations of `nsl.AdversarialRegularization.perturb_on_batch()` and `cleverhans.attacks.FastGradientMethod.generate()` that could cause them to produce different perturbations for the same inputs? Are there other differences between these functions that might contribute to the difference in performance? (I am not interested in speed or efficiency, but in ways in which the results of the two perturbations might differ for the same model, epsilon, and norm.)
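
For reference, the comparison I have in mind (continuing the sketch above, with the same placeholder names) is just an elementwise diff of the two outputs for a fixed, identical model:

```python
# If the two libraries computed identical FGSM perturbations, these
# differences would be ~0 for the same model, batch, eps, and norm.
diff = np.abs(nsl_perturbed - ch_perturbed)
print('max elementwise difference:', diff.max())
print('mean elementwise difference:', diff.mean())
```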