
I use https://github.com/tensorflow/cleverhans to generate adversarial images, but the categories of the attack algorithms are not provided.

All the attack algorithm codes are listed here: https://github.com/tensorflow/cleverhans/tree/master/cleverhans/attacks

I don't know which of these attack algorithms are grey-box attacks and which are white-box or black-box attacks.

I need the category of each algorithm in order to research attack defense algorithms. The GitHub page doesn't provide any information about this. How can I find out?

machen
  • You need to provide a [minimal, complete and verifiable example](https://stackoverflow.com/help/mcve). From your question we cannot see how you generate them. – n1tk Jan 30 '19 at 09:38
  • @n1tk, I mean the cleverhans lib already provides 15 attack algorithms (FGSM, C&W L2, DeepFool, etc.), but their categories are not provided. Which of them are black-box attack algorithms? Which of them are white-box attack algorithms? – machen Jan 30 '19 at 14:01
  • Furthermore, there are bugs inside VirtualAdversarialMethod, the L2 norm version of MomentumIterativeMethod, the L2 norm version of BasicIterativeMethod, and the L-inf norm version of DeepFool. I have already uploaded the whole example script to https://github.com/tensorflow/cleverhans/issues/948 – machen Jan 30 '19 at 14:02
  • But such bugs have not been solved yet. – machen Jan 30 '19 at 14:02

1 Answer


I will start by referencing the paper *Towards Evaluating the Robustness of Neural Networks* by Carlini and Wagner, page 2, last paragraph: "the adversary has complete access to a neural network, including the architecture and all parameters, and can use this in a white-box manner. This is a conservative and realistic assumption: prior work has shown it is possible to train a substitute model given black-box access to a target model, and by attacking the substitute model, we can then transfer these attacks to the target model."

This makes the following two definitions true:

White-box: the attacker has full knowledge of the ML algorithm, the ML model (i.e., parameters and hyperparameters), the architecture, etc. The figure below shows an example of how white-box attacks work:

GAN white-box model architecture
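To make the white-box definition concrete: with full access to the model, the attacker can compute the gradient of the loss with respect to the input, which is exactly what FGSM does. Below is a minimal NumPy sketch (not the cleverhans implementation) of one FGSM step against a toy logistic-regression model; all values and names are illustrative.

```python
import numpy as np

def fgsm_white_box(x, y, w, b, eps):
    """One-step FGSM against a logistic-regression model.

    White-box: we use the model's exact parameters (w, b)
    to compute the gradient of the loss w.r.t. the input x.
    """
    # Forward pass: p = sigmoid(w.x + b)
    p = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))
    # Gradient of the cross-entropy loss w.r.t. x is (p - y) * w
    grad_x = (p - y) * w
    # Step in the sign of the gradient to increase the loss
    return x + eps * np.sign(grad_x)

# Toy model and a correctly classified input (illustrative values)
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.5])   # logit w.x + b = 1.5 > 0, predicted class 1
y = 1.0

x_adv = fgsm_white_box(x, y, w, b, eps=1.0)
print(np.dot(w, x) + b)      # original logit: 1.5
print(np.dot(w, x_adv) + b)  # adversarial logit is pushed below 0 (class 0)
```

The key point is that `grad_x` can only be computed because the attacker knows `w` and `b`; that knowledge is what makes the attack white-box.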

Black-box: the attacker knows almost nothing about the ML system (perhaps the number of features or the ML algorithm). The figure below outlines the steps as an example:

GAN black-box model architecture

Section 3.4, "Demonstration of black box adversarial attack in the physical world", from the paper *Adversarial Examples in the Physical World* by Kurakin et al., 2017, describes the white-box setting (page 9, paragraph 1): "The experiments described above study physical adversarial examples under the assumption that adversary has full access to the model (i.e. the adversary knows the architecture, model weights, etc . . . )."

It follows with an explanation of the black-box setting: "However, the black box scenario, in which the attacker does not have access to the model, is a more realistic model of many security threats."
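The substitute-model idea from the Carlini quote above can be sketched end to end: query the black-box target for labels only, fit a substitute model on those labels, run a white-box attack (FGSM) on the substitute, and transfer the adversarial input back to the target. This is a self-contained NumPy illustration with a hidden linear target; all values and names are illustrative, not the cleverhans API.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Black-box target: the attacker can only query output labels ---
w_target = np.array([2.0, -1.0])   # hidden from the attacker

def query_target(x):
    return 1.0 if np.dot(w_target, x) > 0 else 0.0

# --- Step 1: the attacker labels random queries through the API ---
X = rng.normal(size=(200, 2))
y = np.array([query_target(x) for x in X])

# --- Step 2: train a substitute model on the queried labels ---
w_sub = np.zeros(2)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w_sub))       # substitute predictions
    w_sub -= 0.1 * X.T @ (p - y) / len(X)      # logistic-loss gradient step

# --- Step 3: white-box FGSM on the substitute ---
x = np.array([1.0, 0.5])                       # target classifies this as 1
p = 1.0 / (1.0 + np.exp(-np.dot(w_sub, x)))
grad_x = (p - 1.0) * w_sub                     # loss gradient w.r.t. x
x_adv = x + 1.0 * np.sign(grad_x)

# --- Step 4: the attack transfers to the black-box target ---
print(query_target(x))       # 1.0
print(query_target(x_adv))   # expected to flip to 0.0
```

The attack on the substitute is white-box, but from the target's point of view the whole procedure is black-box: only label queries were used.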

Conclusion: in order to define/label/classify the algorithms as white-box or black-box, you change the threat-model settings, i.e., how much access to the target model the attack is allowed to use.

Note: I don't classify each algorithm individually, because some of the algorithms in the cleverhans library support only white-box settings or only black-box settings, but this is a good starting point. If you are doing research, you need to check every single paper listed in the documentation to understand each method, so that you can generate adversarial examples on your own.

Resources used and interesting papers:

  1. BasicIterativeMethod
  2. CarliniWagnerL2
  3. FastGradientMethod: https://arxiv.org/pdf/1412.6572.pdf, https://arxiv.org/pdf/1611.01236.pdf
  4. SaliencyMapMethod
  5. VirtualAdversarialMethod
  6. FGSM (Fast Gradient Sign Method)
  7. JSMA (Jacobian-based Saliency Map Approach) in the white-box setting
  8. Vatm (virtual adversarial training)
  9. Adversarial Machine Learning: An Introduction, with slides from Binghui Wang
  10. mnist_blackbox
  11. mnist_tutorial_cw
n1tk