Citing RFC 7296, section 2.4, paragraph 3:
Since IKE is designed to operate in spite of DoS attacks from the
network, an endpoint MUST NOT conclude that the other endpoint has
failed based on any routing information (e.g., ICMP messages) or IKE
messages that arrive without cryptographic protection (e.g., Notify
messages complaining about unknown SPIs). An endpoint MUST conclude
that the other endpoint has failed only when repeated attempts to
contact it have gone unanswered for a timeout period or when a
cryptographically protected INITIAL_CONTACT notification is received
on a different IKE SA to the same authenticated identity. An
endpoint should suspect that the other endpoint has failed based on
routing information and initiate a request to see whether the other
endpoint is alive. To check whether the other side is alive, IKE
specifies an empty INFORMATIONAL request that (like all IKE requests)
requires an acknowledgement (note that within the context of an IKE
SA, an "empty" message consists of an IKE header followed by an
Encrypted payload that contains no payloads). If a cryptographically
protected (fresh, i.e., not retransmitted) message has been received
from the other side recently, unprotected Notify messages MAY be
ignored. Implementations MUST limit the rate at which they take
actions based on unprotected messages.
I think that (for the sake of clarity) the relevant types of attacker should be considered separately:
1/ An attacker able to drop arbitrary packets (i.e. an active MitM)
- This attacker can cause a DoS simply by dropping packets, and AFAIK nothing can prevent that. No sophistication is needed to break the communication.
2/ An attacker unable to drop packets
- This attacker cannot prevent peer_2's legitimate responses (to peer_1's INFORMATIONAL requests) from reaching peer_1.
- Thus peer_1 receives the response (before all retries time out) and knows that peer_2 is alive.
3/ An attacker able to drop some packets
- Then it is a race, and the outcome depends on the configuration of the peers and the percentage of packets the attacker is able to drop.
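To give a feeling for the case 3 race, here is a rough back-of-the-envelope sketch. It assumes a deliberately simplified model that is not taken from the RFC: the attacker drops each packet independently with probability drop_rate, and peer_1 gives up after a fixed number of attempts. The function name and both parameters are made up for illustration.

```python
def prob_declared_dead(drop_rate: float, attempts: int) -> float:
    """Probability that every liveness-check attempt goes unanswered.

    Simplified model (an assumption, not from RFC 7296): each packet is
    dropped independently with probability `drop_rate`, and an attempt
    succeeds only if both the request and its response get through.
    """
    if not 0.0 <= drop_rate <= 1.0:
        raise ValueError("drop_rate must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    # Request and response must both survive for one attempt to succeed.
    per_attempt_success = (1.0 - drop_rate) ** 2
    # peer_1 wrongly concludes failure only if all attempts fail.
    return (1.0 - per_attempt_success) ** attempts
```

Under this model, more retransmissions on peer_1's side push the attacker's chance of success down quickly, unless the attacker can drop nearly everything, which is case 1 again.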
EDIT:
I understand the "case 2 attacker" scenario in question this way:
- By receiving the attacker's unprotected INVALID_IKE_SPI notify (spoofed by the attacker from peer_2's address), peer_1 can (at most) only suspect that peer_2 has failed (as it MUST NOT conclude that the other endpoint has failed based on IKE messages without cryptographic protection).
- It may decide (see note below) to issue a liveness check by sending an empty INFORMATIONAL request to peer_2 (which is cryptographically protected).
- The "case 2 attacker" cannot drop this request and cannot modify it undetectably, so it should reach peer_2 (this might involve some implementation-specific retransmits, as specified).
- peer_2 (as it is alive) responds with an acknowledgement (which is cryptographically protected).
- The "case 2 attacker" cannot drop or undetectably modify this response either, so it should reach peer_1.
- Upon receiving this response (which is a fresh, cryptographically protected message from peer_2), peer_1 knows that peer_2 is alive and keeps the SAs (as if nothing had happened).
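The sequence above can be sketched as a small state tracker on peer_1's side. This is only my illustration of the rule from the quoted paragraph, not code from any real IKE implementation: the class, the method names and the max_retries default are all hypothetical, and actual packet handling and cryptography are left out.

```python
from dataclasses import dataclass
from enum import Enum


class PeerState(Enum):
    ALIVE = "alive"
    SUSPECTED = "suspected"   # liveness check in flight
    FAILED = "failed"


@dataclass
class LivenessTracker:
    max_retries: int = 5              # hypothetical retransmit budget
    state: PeerState = PeerState.ALIVE
    unanswered: int = 0

    def on_unprotected_notify(self) -> bool:
        """Handle e.g. an unprotected INVALID_IKE_SPI notify.

        Returns True if a protected liveness check should be sent.
        The state never becomes FAILED here: unprotected input may
        only raise suspicion.
        """
        if self.state is PeerState.ALIVE:
            self.state = PeerState.SUSPECTED
            self.unanswered = 0
            return True
        return False  # a check is already running, or peer already failed

    def on_protected_response(self) -> None:
        """A fresh, cryptographically protected message arrived."""
        if self.state is not PeerState.FAILED:
            self.state = PeerState.ALIVE
            self.unanswered = 0

    def on_retransmit_timeout(self) -> None:
        """One liveness request went unanswered."""
        if self.state is not PeerState.SUSPECTED:
            return
        self.unanswered += 1
        if self.unanswered >= self.max_retries:
            self.state = PeerState.FAILED
```

The point of the sketch is that the only path to FAILED goes through repeated unanswered protected requests, so a case 2 attacker who can only inject spoofed notifies never gets peer_1 there.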
Note: The "Implementations MUST limit the rate at which they take actions based on unprotected messages" part means that peer_1 should not perform this liveness check for every unprotected Notify message received; some implementation-specific rate-limiting mechanism must be in place (probably to prevent traffic amplification).
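One simple way such rate limiting could look is a minimum interval between liveness checks triggered by unprotected messages. The RFC does not prescribe any particular mechanism, so the class below, its name and the 10 second default are purely my assumptions.

```python
import time


class LivenessCheckLimiter:
    """Allow at most one triggered liveness check per `min_interval` seconds."""

    def __init__(self, min_interval: float = 10.0, clock=time.monotonic):
        self.min_interval = min_interval  # hypothetical value, not from the RFC
        self._clock = clock               # injectable so it can be tested
        self._last = None                 # time of the last allowed check

    def allow(self) -> bool:
        """Return True if a liveness check may be sent now."""
        now = self._clock()
        if self._last is not None and now - self._last < self.min_interval:
            return False  # too soon: ignore this unprotected message
        self._last = now
        return True
```

With something like this in front of the liveness check, a flood of spoofed notifies produces at most one INFORMATIONAL request per interval instead of one per spoofed packet.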
Disclaimer: I am no crypto expert, so please do validate my thoughts.