3

I created a VM on GCP with Terraform, plus a separate persistent disk, using the google_compute_disk, google_compute_resource_policy, and google_compute_disk_resource_policy_attachment resources to attach a snapshot schedule to the disk.

That was two days ago, and no snapshot has been created.
Has anybody run into similar issues?

The schedule is set to daily.

This is the Terraform config that I used:


resource "google_compute_disk" "fast_storage" {
  name = "${var.env}-fast-disk"
  type = "pd-ssd"
  size = 50 #GiB
  zone = var.zone
  labels = {
    environment = var.env
    type        = "ssd"

  }
  physical_block_size_bytes = 4096
}


resource "google_compute_resource_policy" "backup_policy" {
  name   = "${var.env}-backup-policy"
  region = var.region
  snapshot_schedule_policy {
    schedule {
      daily_schedule {
        days_in_cycle = 1
        start_time    = "04:00"
      }
    }
    retention_policy {
      max_retention_days    = 5
      on_source_disk_delete = "KEEP_AUTO_SNAPSHOTS"
    }
    snapshot_properties {
      labels = {
        environment = var.env
        type        = "ssd"
      }
      storage_locations = ["eu"]
      guest_flush       = true
    }
  }

}


resource "google_compute_disk_resource_policy_attachment" "backup_policy_attachment" {
  name = google_compute_resource_policy.backup_policy.name
  disk = google_compute_disk.fast_storage.name
  zone = var.zone
}


resource "google_compute_instance" "main" {
  name                      = "${var.env}-main-server"
  machine_type              = "custom-2-4096"
  zone                      = var.zone
  allow_stopping_for_update = true
  desired_status            = "RUNNING"
  deletion_protection       = true
  tags                      = ["${var.env}-main-server"]

  boot_disk {
    auto_delete = false

    mode = "READ_WRITE"
    initialize_params {
      image = "debian-cloud/debian-10"
      type  = "pd-ssd"
      size  = 20
    }
  }

  network_interface {
    network    = var.network_id
    subnetwork = var.subnetwork_id
    network_ip = google_compute_address.main_server_internal.address

    access_config {
      nat_ip = google_compute_address.main_server_external.address
    }
  }
  scheduling {
    on_host_maintenance = "MIGRATE"
    automatic_restart   = true
  }

  lifecycle {
    ignore_changes = [attached_disk]
  }
}


resource "google_compute_attached_disk" "fast_storage" {
  disk        = google_compute_disk.fast_storage.id
  instance    = google_compute_instance.main.id
  device_name = "fast"
  mode        = "READ_WRITE"
}
Alex Duzsardi

2 Answers

3

Set guest_flush = false (this only works with Windows and looks like it's non-negotiable with gcp).

Check Stackdriver Logs - Disks

Stuggi
  • Thank you William, indeed that was the issue. A trimmed error message from the log: "message": "You can only use guest-flush on disks attached to instances with supported operating systems. Make sure you have the latest image version and agent installed with required services (e.g. VSS for Windows)." – Alex Duzsardi Nov 24 '20 at 14:13
3

this only works with Windows and looks like it's non-negotiable with gcp

This is not true. Please follow the article "Creating a Linux application consistent persistent disk snapshot", which explains what is necessary. Basically:

  1. Add a [Snapshots] section to /etc/default/instance_configs.cfg with enabled = true and restart the agent with sudo systemctl restart google-guest-agent.service. The restart creates the /etc/google/snapshots directory if it does not exist yet;
  2. Create pre and post snapshot scripts at /etc/google/snapshots/pre.sh and /etc/google/snapshots/post.sh respectively. Make sure the scripts are executable by root (this is not explicitly stated in the documentation, but is easy to figure out); and
  3. Create a snapshot (or snapshot schedule) with guest-flush enabled.
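Steps 1 and 2 can be sketched as a small script. This is only an illustration under assumptions: the function names are made up, /mnt/fast is a hypothetical mount point for the data disk, and the hooks simply freeze and thaw that one filesystem with fsfreeze. Real pre and post scripts would do whatever your application needs to reach a consistent state.

```shell
#!/bin/sh
# Sketch of steps 1 and 2. Function names and /mnt/fast are hypothetical.

# Step 1: append a [Snapshots] section unless the file already has one.
enable_snapshot_scripts() {
  cfg="$1"
  if ! grep -q '^\[Snapshots\]' "$cfg" 2>/dev/null; then
    printf '\n[Snapshots]\nenabled = true\n' >> "$cfg"
  fi
}

# Step 2: write pre/post hooks that freeze and thaw one filesystem,
# then make both scripts executable.
install_snapshot_hooks() {
  dir="$1"
  mount_point="$2"
  mkdir -p "$dir"
  printf '#!/bin/sh\n/sbin/fsfreeze -f "%s"\n' "$mount_point" > "$dir/pre.sh"
  printf '#!/bin/sh\n/sbin/fsfreeze -u "%s"\n' "$mount_point" > "$dir/post.sh"
  chmod 755 "$dir/pre.sh" "$dir/post.sh"
}

# Touch the real paths only when called with --apply; this needs root.
if [ "${1:-}" = "--apply" ]; then
  enable_snapshot_scripts /etc/default/instance_configs.cfg
  systemctl restart google-guest-agent.service
  install_snapshot_hooks /etc/google/snapshots /mnt/fast
fi
```

Saved under a name of your choice and run as root with --apply on the VM, it edits the real files. Without the flag it only defines the functions, so you can try them against a scratch directory first.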

Just verified it working in GCP.

If you have not prepared your VM instance(s) as explained above, attempts to create an application-consistent snapshot fail with errors like these:

  • If 1. above is not done: Operation type [createSnapshot] failed with message "You can only use guest-flush on disks attached to instances with supported operating systems. Make sure you have the latest image version and agent installed with required services (e.g. VSS for Windows).";
  • If 2. above is not done: Operation type [createSnapshot] failed with message "Guest Consistent Snapshot failed (Detail: pre_snapshot_script or post_snapshot_script not found). Verify you are running the latest image version and agent. For non-Windows, review settings in /etc/default/instance_configs.cfg on the instance. For more information, see the Guest Consistent Snapshot documentation." or the following if scripts are not made executable: Operation type [createSnapshot] failed with message "Guest Consistent Snapshot failed (Detail: unhandled script return code -1). Verify you are running the latest image version and agent. For non-Windows, review settings in /etc/default/instance_configs.cfg on the instance. For more information, see the Guest Consistent Snapshot documentation.".
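As a quick aid, the mapping from error message to missing step can be written as a small helper. The function name is hypothetical and the patterns are just the message fragments quoted above. A non-zero return code from a script that fails for some other reason would presumably look similar to the "not executable" case, so treat the output as a hint only.

```shell
#!/bin/sh
# Map a createSnapshot error message to the preparation step it points at.
# diagnose_snapshot_error is a hypothetical helper name.
diagnose_snapshot_error() {
  case "$1" in
    *"You can only use guest-flush"*)
      echo "step 1: enable [Snapshots] in instance_configs.cfg and restart the agent" ;;
    *"pre_snapshot_script or post_snapshot_script not found"*)
      echo "step 2: create pre.sh and post.sh" ;;
    *"unhandled script return code"*)
      echo "step 2: make pre.sh and post.sh executable" ;;
    *)
      echo "unknown: read the full log entry" ;;
  esac
}
```

For example, diagnose_snapshot_error "Guest Consistent Snapshot failed (Detail: unhandled script return code -1)." points at the executable bit on the scripts.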

If a snapshot schedule is not producing snapshots, check Logs Explorer for errors such as "You can only use guest-flush on disks attached to instances with supported operating systems. Make sure you have the latest image version and agent installed with required services (e.g. VSS for Windows)." A query like the following finds them easily:

severity=ERROR
resource.type="gce_disk"
protoPayload.methodName="ScheduledSnapshots"

It means you have not prepared your VMs as explained above.
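The same filter can also be run from a terminal with gcloud logging read. A minimal sketch, assuming gcloud is installed, authenticated, and pointed at the right project. The function name, the three-day window, and the entry limit are arbitrary choices for illustration.

```shell
#!/bin/sh
# Build the Logs Explorer filter as a single-line expression.
# snapshot_error_filter is a hypothetical helper name.
snapshot_error_filter() {
  printf '%s AND %s AND %s' \
    'severity=ERROR' \
    'resource.type="gce_disk"' \
    'protoPayload.methodName="ScheduledSnapshots"'
}

# Run the query only when called with --run; needs an authenticated gcloud.
if [ "${1:-}" = "--run" ]; then
  gcloud logging read "$(snapshot_error_filter)" \
    --freshness=3d --limit=10 --format=json
fi
```

The separate lines in the Logs Explorer query are implicitly ANDed, so joining them with an explicit AND should match the same entries.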