Another answerer linked a question with a popular (though not yet accepted) answer stating that Dataflow ACKs a message only once, for the bundle the message belongs to, the "results of the bundle (outputs and state mutations etc) have been durably committed" (see: At what stage does Dataflow/Apache Beam ack a pub/sub message?).
It's important to note that Dataflow has to commit state whenever there are stateful operations in your pipeline. For example, with windowing, Dataflow needs to stash your data somewhere while it waits for the window to close, at which point it pulls the state back out and sends it on to the next part of your pipeline.
This behavior matches what I've observed using Dataflow in production for a few years now. We used to have a stateless pipeline (no windowing, etc.), and it NACKed messages when exceptions occurred in any part of the pipeline. When we added windowing, we noticed it ACKing messages even though the window the message belonged to had not yet closed (and nothing had been written to the sink at the end of the pipeline).
Therefore, the situation you're concerned about, where a message is ACKed even though it is "bad", will occur in pipelines that have stateful operations: the message has to be ACKed so that its state can be durably committed, and your code won't deem it "bad" until after that point. The situation won't occur, and you can safely rely on a NACK for these "bad" messages, if your pipeline has no stateful operations (and all stateless operations finish within the ACK deadline you've configured for your Pub/Sub subscription).
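To make that reasoning concrete, here is a toy model in plain Python. It is not Beam or Dataflow code, and `simulate`, `Step`, and the `user_id` check are made up for illustration. It assumes the message is ACKed at the first durable commit: the first stateful step, or the end of the pipeline if there is none.

```python
from typing import Callable, List, Tuple

# A step is (name, function, is_stateful). Hypothetical model, not a Beam API.
Step = Tuple[str, Callable[[dict], dict], bool]


def simulate(message: dict, steps: List[Step]) -> str:
    """Return "ACK" or "NACK" for one message under the toy model."""
    value = message
    for _name, fn, is_stateful in steps:
        if is_stateful:
            # State is durably committed here, so the message is ACKed
            # before any later step gets a chance to reject it.
            return "ACK"
        try:
            value = fn(value)
        except ValueError:
            # Failure before any durable commit: the message is not ACKed.
            return "NACK"
    # No stateful step: commit (and ACK) only after every step succeeded.
    return "ACK"


def parse(msg: dict) -> dict:
    return msg


def validate(msg: dict) -> dict:
    if "user_id" not in msg:
        raise ValueError("missing user_id")
    return msg


def window(msg: dict) -> dict:
    return msg  # stands in for buffering the element in window state


bad = {"payload": "no user id"}
stateless = [("parse", parse, False), ("validate", validate, False)]
stateful = [("parse", parse, False), ("window", window, True), ("validate", validate, False)]
```

Under these assumptions, `simulate(bad, stateless)` yields a NACK because validation fails before anything is committed, while `simulate(bad, stateful)` yields an ACK because the windowing step commits state before validation ever runs.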
If this is a problem for you, because you have stateful operations in your pipeline, I'd suggest one of two things:
- Add validation before the Pub/Sub message is published, such that no "bad" messages will enter your pipeline, or
- Break up your pipeline into two pipelines, one stateless and one stateful, such that messages can only be deemed "bad" in the first (stateless) pipeline. A "bad" message is then NACKed there and can be retried later, once the pipeline has been updated to no longer deem it "bad", or discarded through other means if it isn't needed.
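
The first option could look roughly like the sketch below. The required fields and the helper names (`validation_errors`, `publish_if_valid`) are hypothetical; substitute whatever rules make a message "bad" in your pipeline. The `publish` argument is any callable that sends the bytes, so the check itself stays independent of the Pub/Sub client.

```python
import json
from typing import Callable, List

REQUIRED_FIELDS = ("user_id", "event_type")  # hypothetical schema


def validation_errors(data: bytes) -> List[str]:
    """Return the reasons a payload would be rejected (empty list if valid)."""
    try:
        event = json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        return [f"not valid UTF-8 JSON: {exc}"]
    if not isinstance(event, dict):
        return ["payload must be a JSON object"]
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in event]


def publish_if_valid(data: bytes, publish: Callable[[bytes], object]) -> List[str]:
    """Publish only payloads that pass validation; return any errors found."""
    errors = validation_errors(data)
    if not errors:
        publish(data)
    return errors
```

With the `google-cloud-pubsub` client, `publish` could be something like `lambda data: publisher.publish(topic_path, data)`, where `publisher` and `topic_path` are set up elsewhere in your code. Rejected payloads never reach the topic, so the pipeline never has to NACK them.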