For our DB maintenance strategy, we need to stop a running database before taking a copy of the VM that hosts it. Since the database is in production, we need to keep the time during which it is stopped to a minimum, but we also want to be sure that the image/procedure has reached a state in which we can restart the database without causing I/O issues on the image.
In pseudocode:
stop_db
gcloud compute machine-images create ${IMAGE_NAME} --source-instance=...
restart_db
Is there any difference if we use a disk snapshot instead?
stop_db
gcloud compute disks snapshot ${DISK} --snapshot-names=${SNAPSHOT_NAME} --async
restart_db
How can we be sure that it is safe to restart the database? Is there a way to know that the "machine-images create" or "compute disks snapshot" API has "finished" the sync part?
The tests we have done so far are:
Test 1, with gcloud compute machine-images create: we poll until the state is "CREATING" and then restart the database.
Test 2, with gcloud compute disks snapshot: we poll until the state is "CREATING" and then restart the database.
In both cases the time needed is more than 10 seconds, which is too long for our use case. The time does not change even when the disks hold very similar data (with no delta between them).
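For reference, here is a minimal sketch of the kind of polling described in test 2. It assumes the snapshot state is read with `gcloud compute snapshots describe` and `--format='value(status)'`. The function name `wait_for_snapshot_state`, the one-second interval and the timeout are hypothetical and only for illustration. It does not settle which state is actually the safe point to restart the database.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a snapshot until it reports one of the accepted states.
# Usage: wait_for_snapshot_state SNAPSHOT_NAME "STATE1 STATE2 ..." [TIMEOUT_SECONDS]
# Prints the matching state and returns 0, or returns 1 on timeout.
wait_for_snapshot_state() {
  local name=$1 accepted=$2 timeout=${3:-60}
  local deadline=$(( SECONDS + timeout ))
  local state s
  while (( SECONDS < deadline )); do
    # Stays empty while the snapshot resource is not visible yet (describe fails).
    state=$(gcloud compute snapshots describe "$name" --format='value(status)' 2>/dev/null || true)
    for s in $accepted; do
      if [[ $state == "$s" ]]; then
        echo "$state"
        return 0
      fi
    done
    sleep 1
  done
  return 1
}
```

With this helper, test 2 would read roughly: stop_db, then the `gcloud compute disks snapshot ... --async` call, then `wait_for_snapshot_state "${SNAPSHOT_NAME}" "CREATING" 60`, then restart_db. Test 1 could presumably be sketched the same way with `gcloud compute machine-images describe` in place of the snapshot command.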