
I have a cloud-init file that sets up all requirements for our AWS instances, and part of those requirements is formatting and mounting an EBS volume. The issue is that on some instances the volume attachment occurs after the instance is up, so when cloud-init executes, the volume /dev/xvdf does not yet exist and the setup fails.

I have something like:

#cloud-config

resize_rootfs: false
disk_setup:
    /dev/xvdf:
        table_type: 'gpt'
        layout: true
        overwrite: false

fs_setup:
    - label: DATA
      filesystem: 'ext4'
      device: '/dev/xvdf'
      partition: 'auto'

mounts:
    - [xvdf, /data, auto, "defaults,discard", "0", "0"]

I would like to have something like a sleep 60 before the disk configuration block.

If the whole cloud-init execution can be delayed, that would also work for me.

Also, I'm using terraform to create the infrastructure.

Thanks!

Christian Rodriguez
  • Actually, attached devices are, from a system perspective, hotplug devices, so you can use udev rules. Most distributions already ship rules that create device links the way Amazon Linux does, also for other cloud providers. Be warned that disk_setup is not available with all distributions; you may need to add the module manually (it is available in the cloud-init source). A restart is simpler and therefore more practical. If you're worried about the hackish nature: udev's hotplug handler provides the trigger when a disk is attached (attaching is an additional AWS call at an unknown time...) – kai Aug 05 '22 at 02:15
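
(As a rough illustration of the udev approach described in the comment above, staying within cloud-config: a write_files entry could drop a rule that runs a helper script whenever the volume is hot-plugged. The rule file name and the setup-data-volume.sh script are hypothetical, not something from the question.)

#cloud-config

write_files:
  # Hypothetical rule: when the xvdf block device is added (hot-plugged),
  # run a (hypothetical) helper script that formats the volume if needed
  # and mounts it.
  - path: /etc/udev/rules.d/99-ebs-data.rules
    permissions: '0644'
    content: |
      ACTION=="add", KERNEL=="xvdf", SUBSYSTEM=="block", RUN+="/usr/local/bin/setup-data-volume.sh"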

2 Answers


I guess cloud-init does have an option for running ad-hoc commands; have a look at this link:

https://cloudinit.readthedocs.io/en/latest/topics/modules.html?highlight=runcmd#runcmd

Not sure what your code looks like, but I just tried passing the below as user_data in AWS and could see that the init script sleeps for 1000 seconds (I just added a couple of echo statements to check later). You can add a little more logic as well to verify the presence of the volume; see the sketch after the example below.

#cloud-config

runcmd:
 - [ sh, -c, "echo before sleep:`date` >> /tmp/user_data.log" ]
 - [ sh, -c, "sleep 1000" ]
 - [ sh, -c, "echo after sleep:`date` >> /tmp/user_data.log" ]
 
<Rest of the script> 
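
If a fixed sleep feels too arbitrary, a minimal sketch of that extra logic (assuming the volume shows up as /dev/xvdf and that polling every 5 seconds is acceptable; not tested against your setup) would wait for the block device instead of sleeping for a fixed time:

#cloud-config

runcmd:
 - [ sh, -c, "echo waiting for /dev/xvdf:`date` >> /tmp/user_data.log" ]
 # poll until the block device node exists (device name is an assumption)
 - [ sh, -c, "until [ -b /dev/xvdf ]; do sleep 5; done" ]
 - [ sh, -c, "echo /dev/xvdf present:`date` >> /tmp/user_data.log" ]

Note that, like the fixed sleep, this runs in the runcmd phase, so on its own it doesn't change when the disk_setup module runs.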
Giridhar
  • Forgot to mention that I'd like to keep the formatting within the cloud-init space, not relying on commands, since I was given the task to translate the old shell scripts into cloud-init. Anyway, if it's not possible to delay this, then I'd have to give it a try. ;) – Christian Rodriguez Sep 28 '20 at 14:53
  • Sure, post back in case you happen to find any different options. Also, using "runcmd" is still part of cloud-init; it's just that you'll still be executing Linux commands (but no external shell scripts... so tidy to some extent at least). Cheers, good luck! – Giridhar Sep 28 '20 at 19:31

I was able to resolve the issue with two changes:

  1. Changed the mount options, adding the nofail option.
  2. Added a line to the runcmd block, deleting the semaphore file for disk_setup.

So my new cloud-init file now looks like this:

#cloud-config

resize_rootfs: false
disk_setup:
    /dev/xvdf:
        table_type: 'gpt'
        layout: true
        overwrite: false

fs_setup:
    - label: DATA
      filesystem: 'ext4'
      device: '/dev/xvdf'
      partition: 'auto'

mounts:
    - [xvdf, /data, auto, "defaults,nofail,discard", "0", "0"]
    
runcmd:
    # run through a shell so the * glob in the path gets expanded
    - [sh, -c, "rm -f /var/lib/cloud/instances/*/sem/config_disk_setup"]

power_state:
    mode: reboot
    timeout: 30

It will reboot, and because the config_disk_setup semaphore was removed, cloud-init will execute the disk_setup module once more on the next boot. By this time the volume will be attached, so the operation won't fail.
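
If you want to verify the behaviour, the per-instance semaphores live under /var/lib/cloud/instance/sem (assuming the standard cloud-init layout, where /var/lib/cloud/instance is a symlink to the current instance's directory under /var/lib/cloud/instances). After the reboot you can check that the marker came back once the module ran again:

# list the per-instance module semaphores; config_disk_setup should
# reappear after the disk_setup module has run again
ls /var/lib/cloud/instance/sem/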

I guess this is kind of a hacky way to solve this, so if someone has a better answer (like how to delay the whole cloud-init execution) please share it.

Christian Rodriguez