
So I was able to configure PCS and libvirt to work properly together. I can migrate VMs between my two nodes with no problems. All was working well until I needed to place one of my nodes into standby mode for maintenance. It seems that the resource for my shared filesystem is being taken down before the VMs are able to finish migrating to the other node.

I have tried using ordering constraints and configured an orderly startup for everything, but when I put a node into standby mode, the resources on BOTH nodes shut down and are then restarted on the non-standby node. I have played around with the kind=Optional switch on the constraints, but have not gained any ground.
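For reference, the ordering attempts described above would look something like this in pcs (hypothetical commands reconstructed from the description, not taken from the actual configuration; resource names match the CIB below):

```shell
# Mandatory ordering (the default, kind=Mandatory): the filesystem
# clone must be started before the VM starts, and conversely the VM
# must be stopped before the filesystem clone stops.
pcs constraint order start clusterfs_vms-clone then start BAK01

# Advisory variant: kind=Optional only orders the two actions when
# both happen to be scheduled in the same transition; it does not
# force one resource to stop because the other is stopping.
pcs constraint order start clusterfs_vms-clone then start BAK01 kind=Optional
```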

I know this must be an easy answer, but I can't find it. Any help would be appreciated.

Terry

<cib crm_feature_set="3.0.10" validate-with="pacemaker-2.5" epoch="428" num_updates="0" admin_epoch="0" cib-last-written="Mon May 22 16:22:42 2017" update-origin="kvm01" update-client="crm_attribute" update-user="root" have-quorum="1" dc-uuid="2">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="false"/>
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.15-11.el7-e174ec8"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
        <nvpair id="cib-bootstrap-options-cluster-name" name="cluster-name" value="kvm"/>
        <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="true"/>
        <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="freeze"/>
        <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1495482722"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="1" uname="kvm01">
        <instance_attributes id="nodes-1"/>
      </node>
      <node id="2" uname="kvm02">
        <instance_attributes id="nodes-2"/>
      </node>
    </nodes>
    <resources>
      <primitive class="stonith" id="kvm01_ilo" type="fence_ilo4_ssh">
        <instance_attributes id="kvm01_ilo-instance_attributes">
          <nvpair id="kvm01_ilo-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="kvm01"/>
          <nvpair id="kvm01_ilo-instance_attributes-ipaddr" name="ipaddr" value="10.0.1.40"/>
          <nvpair id="kvm01_ilo-instance_attributes-login" name="login" value="Administrator"/>
          <nvpair id="kvm01_ilo-instance_attributes-passwd" name="passwd" value="iloadmin"/>
          <nvpair id="kvm01_ilo-instance_attributes-action" name="action" value="reboot"/>
          <nvpair id="kvm01_ilo-instance_attributes-secure" name="secure" value="1"/>
          <nvpair id="kvm01_ilo-instance_attributes-delay" name="delay" value="15"/>
        </instance_attributes>
        <operations>
          <op id="kvm01_ilo-monitor-interval-60s" interval="60s" name="monitor"/>
        </operations>
        <meta_attributes id="kvm01_ilo-meta_attributes">
          <nvpair id="kvm01_ilo-meta_attributes-target-role" name="target-role" value="Stopped"/>
        </meta_attributes>
      </primitive>
      <primitive class="stonith" id="kvm02_ilo" type="fence_ilo4_ssh">
        <instance_attributes id="kvm02_ilo-instance_attributes">
          <nvpair id="kvm02_ilo-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="kvm02"/>
          <nvpair id="kvm02_ilo-instance_attributes-ipaddr" name="ipaddr" value="10.0.1.41"/>
          <nvpair id="kvm02_ilo-instance_attributes-login" name="login" value="Administrator"/>
          <nvpair id="kvm02_ilo-instance_attributes-passwd" name="passwd" value="iloadmin"/>
          <nvpair id="kvm02_ilo-instance_attributes-action" name="action" value="reboot"/>
          <nvpair id="kvm02_ilo-instance_attributes-secure" name="secure" value="1"/>
          <nvpair id="kvm02_ilo-instance_attributes-delay" name="delay" value="15"/>
        </instance_attributes>
        <operations>
          <op id="kvm02_ilo-monitor-interval-60s" interval="60s" name="monitor"/>
        </operations>
        <meta_attributes id="kvm02_ilo-meta_attributes">
          <nvpair id="kvm02_ilo-meta_attributes-target-role" name="target-role" value="Stopped"/>
        </meta_attributes>
      </primitive>
      <primitive class="ocf" id="ClusterIP" provider="heartbeat" type="IPaddr2">
        <instance_attributes id="ClusterIP-instance_attributes">
          <nvpair id="ClusterIP-instance_attributes-ip" name="ip" value="10.0.1.10"/>
          <nvpair id="ClusterIP-instance_attributes-cidr_netmask" name="cidr_netmask" value="32"/>
        </instance_attributes>
        <operations>
          <op id="ClusterIP-start-interval-0s" interval="0s" name="start" timeout="20s"/>
          <op id="ClusterIP-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
          <op id="ClusterIP-monitor-interval-30s" interval="30s" name="monitor"/>
        </operations>
      </primitive>
      <primitive class="ocf" id="BAK01" provider="heartbeat" type="VirtualDomain">
        <instance_attributes id="BAK01-instance_attributes">
          <nvpair id="BAK01-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
          <nvpair id="BAK01-instance_attributes-config" name="config" value="/shared/vms/qemu_configs/bak01.xml"/>
          <nvpair id="BAK01-instance_attributes-migration_transport" name="migration_transport" value="ssh"/>
          <nvpair id="BAK01-instance_attributes-migrate_options" name="migrate_options" value="--p2p --tunnelled"/>
        </instance_attributes>
        <meta_attributes id="BAK01-meta_attributes">
          <nvpair id="BAK01-meta_attributes-allow-migrate" name="allow-migrate" value="true"/>
          <nvpair id="BAK01-meta_attributes-priority" name="priority" value="100"/>
        </meta_attributes>
        <operations>
          <op id="BAK01-start-interval-0s" interval="0s" name="start" timeout="120s"/>
          <op id="BAK01-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
          <op id="BAK01-monitor-interval-10" interval="10" name="monitor" timeout="30"/>
          <op id="BAK01-migrate_from-interval-0" interval="0" name="migrate_from" timeout="120s"/>
          <op id="BAK01-migrate_to-interval-0" interval="0" name="migrate_to" timeout="120s"/>
        </operations>
        <utilization id="BAK01-utilization">
          <nvpair id="BAK01-utilization-cpu" name="cpu" value="1"/>
          <nvpair id="BAK01-utilization-hv_memory" name="hv_memory" value="2048"/>
        </utilization>
      </primitive>
      <primitive class="ocf" id="CMS01" provider="heartbeat" type="VirtualDomain">
        <instance_attributes id="CMS01-instance_attributes">
          <nvpair id="CMS01-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
          <nvpair id="CMS01-instance_attributes-config" name="config" value="/shared/vms/qemu_configs/cms01.xml"/>
          <nvpair id="CMS01-instance_attributes-migration_transport" name="migration_transport" value="ssh"/>
          <nvpair id="CMS01-instance_attributes-migrate_options" name="migrate_options" value="--p2p --tunnelled"/>
        </instance_attributes>
        <meta_attributes id="CMS01-meta_attributes">
          <nvpair id="CMS01-meta_attributes-allow-migrate" name="allow-migrate" value="true"/>
          <nvpair id="CMS01-meta_attributes-priority" name="priority" value="100"/>
        </meta_attributes>
        <operations>
          <op id="CMS01-start-interval-0s" interval="0s" name="start" timeout="120s"/>
          <op id="CMS01-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
          <op id="CMS01-monitor-interval-10" interval="10" name="monitor" timeout="30"/>
          <op id="CMS01-migrate_from-interval-0" interval="0" name="migrate_from" timeout="120s"/>
          <op id="CMS01-migrate_to-interval-0" interval="0" name="migrate_to" timeout="120s"/>
        </operations>
        <utilization id="CMS01-utilization">
          <nvpair id="CMS01-utilization-cpu" name="cpu" value="4"/>
          <nvpair id="CMS01-utilization-hv_memory" name="hv_memory" value="32768"/>
        </utilization>
      </primitive>
      <primitive class="ocf" id="ELK01" provider="heartbeat" type="VirtualDomain">
        <instance_attributes id="ELK01-instance_attributes">
          <nvpair id="ELK01-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
          <nvpair id="ELK01-instance_attributes-config" name="config" value="/shared/vms/qemu_configs/elk01.xml"/>
          <nvpair id="ELK01-instance_attributes-migration_transport" name="migration_transport" value="ssh"/>
          <nvpair id="ELK01-instance_attributes-migrate_options" name="migrate_options" value="--p2p --tunnelled"/>
        </instance_attributes>
        <meta_attributes id="ELK01-meta_attributes">
          <nvpair id="ELK01-meta_attributes-allow-migrate" name="allow-migrate" value="true"/>
          <nvpair id="ELK01-meta_attributes-priority" name="priority" value="100"/>
        </meta_attributes>
        <operations>
          <op id="ELK01-start-interval-0s" interval="0s" name="start" timeout="120s"/>
          <op id="ELK01-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
          <op id="ELK01-monitor-interval-10" interval="10" name="monitor" timeout="30"/>
          <op id="ELK01-migrate_from-interval-0" interval="0" name="migrate_from" timeout="120s"/>
          <op id="ELK01-migrate_to-interval-0" interval="0" name="migrate_to" timeout="120s"/>
        </operations>
        <utilization id="ELK01-utilization">
          <nvpair id="ELK01-utilization-cpu" name="cpu" value="4"/>
          <nvpair id="ELK01-utilization-hv_memory" name="hv_memory" value="32768"/>
        </utilization>
      </primitive>
      <primitive class="ocf" id="ELK02" provider="heartbeat" type="VirtualDomain">
        <instance_attributes id="ELK02-instance_attributes">
          <nvpair id="ELK02-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
          <nvpair id="ELK02-instance_attributes-config" name="config" value="/shared/vms/qemu_configs/elk02.xml"/>
          <nvpair id="ELK02-instance_attributes-migration_transport" name="migration_transport" value="ssh"/>
          <nvpair id="ELK02-instance_attributes-migrate_options" name="migrate_options" value="--p2p --tunnelled"/>
        </instance_attributes>
        <meta_attributes id="ELK02-meta_attributes">
          <nvpair id="ELK02-meta_attributes-allow-migrate" name="allow-migrate" value="true"/>
          <nvpair id="ELK02-meta_attributes-priority" name="priority" value="100"/>
        </meta_attributes>
        <operations>
          <op id="ELK02-start-interval-0s" interval="0s" name="start" timeout="120s"/>
          <op id="ELK02-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
          <op id="ELK02-monitor-interval-10" interval="10" name="monitor" timeout="30"/>
          <op id="ELK02-migrate_from-interval-0" interval="0" name="migrate_from" timeout="120s"/>
          <op id="ELK02-migrate_to-interval-0" interval="0" name="migrate_to" timeout="120s"/>
        </operations>
        <utilization id="ELK02-utilization">
          <nvpair id="ELK02-utilization-cpu" name="cpu" value="4"/>
          <nvpair id="ELK02-utilization-hv_memory" name="hv_memory" value="32768"/>
        </utilization>
      </primitive>
      <primitive class="ocf" id="ELK03" provider="heartbeat" type="VirtualDomain">
        <instance_attributes id="ELK03-instance_attributes">
          <nvpair id="ELK03-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
          <nvpair id="ELK03-instance_attributes-config" name="config" value="/shared/vms/qemu_configs/elk03.xml"/>
          <nvpair id="ELK03-instance_attributes-migration_transport" name="migration_transport" value="ssh"/>
          <nvpair id="ELK03-instance_attributes-migrate_options" name="migrate_options" value="--p2p --tunnelled"/>
        </instance_attributes>
        <meta_attributes id="ELK03-meta_attributes">
          <nvpair id="ELK03-meta_attributes-allow-migrate" name="allow-migrate" value="true"/>
          <nvpair id="ELK03-meta_attributes-priority" name="priority" value="100"/>
        </meta_attributes>
        <operations>
          <op id="ELK03-start-interval-0s" interval="0s" name="start" timeout="120s"/>
          <op id="ELK03-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
          <op id="ELK03-monitor-interval-10" interval="10" name="monitor" timeout="30"/>
          <op id="ELK03-migrate_from-interval-0" interval="0" name="migrate_from" timeout="120s"/>
          <op id="ELK03-migrate_to-interval-0" interval="0" name="migrate_to" timeout="120s"/>
        </operations>
        <utilization id="ELK03-utilization">
          <nvpair id="ELK03-utilization-cpu" name="cpu" value="4"/>
          <nvpair id="ELK03-utilization-hv_memory" name="hv_memory" value="32768"/>
        </utilization>
      </primitive>
      <primitive class="ocf" id="IPA01" provider="heartbeat" type="VirtualDomain">
        <instance_attributes id="IPA01-instance_attributes">
          <nvpair id="IPA01-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
          <nvpair id="IPA01-instance_attributes-config" name="config" value="/shared/vms/qemu_configs/ipa01.xml"/>
          <nvpair id="IPA01-instance_attributes-migration_transport" name="migration_transport" value="ssh"/>
          <nvpair id="IPA01-instance_attributes-migrate_options" name="migrate_options" value="--p2p --tunnelled"/>
        </instance_attributes>
        <meta_attributes id="IPA01-meta_attributes">
          <nvpair id="IPA01-meta_attributes-allow-migrate" name="allow-migrate" value="true"/>
          <nvpair id="IPA01-meta_attributes-priority" name="priority" value="100"/>
        </meta_attributes>
        <operations>
          <op id="IPA01-start-interval-0s" interval="0s" name="start" timeout="120s"/>
          <op id="IPA01-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
          <op id="IPA01-monitor-interval-10" interval="10" name="monitor" timeout="30"/>
          <op id="IPA01-migrate_from-interval-0" interval="0" name="migrate_from" timeout="120s"/>
          <op id="IPA01-migrate_to-interval-0" interval="0" name="migrate_to" timeout="120s"/>
        </operations>
        <utilization id="IPA01-utilization">
          <nvpair id="IPA01-utilization-cpu" name="cpu" value="1"/>
          <nvpair id="IPA01-utilization-hv_memory" name="hv_memory" value="3072"/>
        </utilization>
      </primitive>
      <primitive class="ocf" id="IPA02" provider="heartbeat" type="VirtualDomain">
        <instance_attributes id="IPA02-instance_attributes">
          <nvpair id="IPA02-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
          <nvpair id="IPA02-instance_attributes-config" name="config" value="/shared/vms/qemu_configs/ipa02.xml"/>
          <nvpair id="IPA02-instance_attributes-migration_transport" name="migration_transport" value="ssh"/>
          <nvpair id="IPA02-instance_attributes-migrate_options" name="migrate_options" value="--p2p --tunnelled"/>
        </instance_attributes>
        <meta_attributes id="IPA02-meta_attributes">
          <nvpair id="IPA02-meta_attributes-allow-migrate" name="allow-migrate" value="true"/>
          <nvpair id="IPA02-meta_attributes-priority" name="priority" value="100"/>
        </meta_attributes>
        <operations>
          <op id="IPA02-start-interval-0s" interval="0s" name="start" timeout="120s"/>
          <op id="IPA02-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
          <op id="IPA02-monitor-interval-10" interval="10" name="monitor" timeout="30"/>
          <op id="IPA02-migrate_from-interval-0" interval="0" name="migrate_from" timeout="120s"/>
          <op id="IPA02-migrate_to-interval-0" interval="0" name="migrate_to" timeout="120s"/>
        </operations>
        <utilization id="IPA02-utilization">
          <nvpair id="IPA02-utilization-cpu" name="cpu" value="1"/>
          <nvpair id="IPA02-utilization-hv_memory" name="hv_memory" value="3072"/>
        </utilization>
      </primitive>
      <primitive class="ocf" id="PXY01" provider="heartbeat" type="VirtualDomain">
        <instance_attributes id="PXY01-instance_attributes">
          <nvpair id="PXY01-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
          <nvpair id="PXY01-instance_attributes-config" name="config" value="/shared/vms/qemu_configs/pxy01.xml"/>
          <nvpair id="PXY01-instance_attributes-migration_transport" name="migration_transport" value="ssh"/>
          <nvpair id="PXY01-instance_attributes-migrate_options" name="migrate_options" value="--p2p --tunnelled"/>
        </instance_attributes>
        <meta_attributes id="PXY01-meta_attributes">
          <nvpair id="PXY01-meta_attributes-allow-migrate" name="allow-migrate" value="true"/>
          <nvpair id="PXY01-meta_attributes-priority" name="priority" value="100"/>
        </meta_attributes>
        <operations>
          <op id="PXY01-start-interval-0s" interval="0s" name="start" timeout="120s"/>
          <op id="PXY01-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
          <op id="PXY01-monitor-interval-10" interval="10" name="monitor" timeout="30"/>
          <op id="PXY01-migrate_from-interval-0" interval="0" name="migrate_from" timeout="120s"/>
          <op id="PXY01-migrate_to-interval-0" interval="0" name="migrate_to" timeout="120s"/>
        </operations>
        <utilization id="PXY01-utilization">
          <nvpair id="PXY01-utilization-cpu" name="cpu" value="1"/>
          <nvpair id="PXY01-utilization-hv_memory" name="hv_memory" value="2048"/>
        </utilization>
      </primitive>
      <primitive class="ocf" id="WIK01" provider="heartbeat" type="VirtualDomain">
        <instance_attributes id="WIK01-instance_attributes">
          <nvpair id="WIK01-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
          <nvpair id="WIK01-instance_attributes-config" name="config" value="/shared/vms/qemu_configs/wik01.xml"/>
          <nvpair id="WIK01-instance_attributes-migration_transport" name="migration_transport" value="ssh"/>
          <nvpair id="WIK01-instance_attributes-migrate_options" name="migrate_options" value="--p2p --tunnelled"/>
        </instance_attributes>
        <meta_attributes id="WIK01-meta_attributes">
          <nvpair id="WIK01-meta_attributes-allow-migrate" name="allow-migrate" value="true"/>
          <nvpair id="WIK01-meta_attributes-priority" name="priority" value="100"/>
        </meta_attributes>
        <operations>
          <op id="WIK01-start-interval-0s" interval="0s" name="start" timeout="120s"/>
          <op id="WIK01-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
          <op id="WIK01-monitor-interval-10" interval="10" name="monitor" timeout="30"/>
          <op id="WIK01-migrate_from-interval-0" interval="0" name="migrate_from" timeout="120s"/>
          <op id="WIK01-migrate_to-interval-0" interval="0" name="migrate_to" timeout="120s"/>
        </operations>
        <utilization id="WIK01-utilization">
          <nvpair id="WIK01-utilization-cpu" name="cpu" value="1"/>
          <nvpair id="WIK01-utilization-hv_memory" name="hv_memory" value="2048"/>
        </utilization>
      </primitive>
      <clone id="dlm-clone">
        <primitive class="ocf" id="dlm" provider="pacemaker" type="controld">
          <instance_attributes id="dlm-instance_attributes"/>
          <operations>
            <op id="dlm-start-interval-0s" interval="0s" name="start" timeout="90"/>
            <op id="dlm-stop-interval-0s" interval="0s" name="stop" timeout="100"/>
            <op id="dlm-monitor-interval-30s" interval="30s" name="monitor" on-fail="fence"/>
          </operations>
        </primitive>
        <meta_attributes id="dlm-clone-meta_attributes">
          <nvpair id="dlm-clone-meta_attributes-interleave" name="interleave" value="true"/>
          <nvpair id="dlm-clone-meta_attributes-ordered" name="ordered" value="true"/>
        </meta_attributes>
      </clone>
      <clone id="clvmd-clone">
        <primitive class="ocf" id="clvmd" provider="heartbeat" type="clvm">
          <instance_attributes id="clvmd-instance_attributes"/>
          <operations>
            <op id="clvmd-start-interval-0s" interval="0s" name="start" timeout="90"/>
            <op id="clvmd-stop-interval-0s" interval="0s" name="stop" timeout="90"/>
            <op id="clvmd-monitor-interval-30s" interval="30s" name="monitor" on-fail="fence"/>
          </operations>
        </primitive>
        <meta_attributes id="clvmd-clone-meta_attributes">
          <nvpair id="clvmd-clone-meta_attributes-interleave" name="interleave" value="true"/>
          <nvpair id="clvmd-clone-meta_attributes-ordered" name="ordered" value="true"/>
        </meta_attributes>
      </clone>
      <clone id="clusterfs_vms-clone">
        <primitive class="ocf" id="clusterfs_vms" provider="heartbeat" type="Filesystem">
          <instance_attributes id="clusterfs_vms-instance_attributes">
            <nvpair id="clusterfs_vms-instance_attributes-device" name="device" value="/dev/cluster_vg_vms/cluster_lv_vms"/>
            <nvpair id="clusterfs_vms-instance_attributes-directory" name="directory" value="/shared/vms"/>
            <nvpair id="clusterfs_vms-instance_attributes-fstype" name="fstype" value="gfs2"/>
            <nvpair id="clusterfs_vms-instance_attributes-options" name="options" value="noatime"/>
          </instance_attributes>
          <operations>
            <op id="clusterfs_vms-start-interval-0s" interval="0s" name="start" timeout="60"/>
            <op id="clusterfs_vms-stop-interval-0s" interval="0s" name="stop" timeout="60"/>
            <op id="clusterfs_vms-monitor-interval-10s" interval="10s" name="monitor" on-fail="fence"/>
          </operations>
        </primitive>
        <meta_attributes id="clusterfs_vms-clone-meta_attributes">
          <nvpair id="clusterfs_vms-clone-meta_attributes-interleave" name="interleave" value="true"/>
        </meta_attributes>
      </clone>
      <clone id="clusterfs_logs-clone">
        <primitive class="ocf" id="clusterfs_logs" provider="heartbeat" type="Filesystem">
          <instance_attributes id="clusterfs_logs-instance_attributes">
            <nvpair id="clusterfs_logs-instance_attributes-device" name="device" value="/dev/cluster_vg_logs/cluster_lv_logs"/>
            <nvpair id="clusterfs_logs-instance_attributes-directory" name="directory" value="/shared/logs"/>
            <nvpair id="clusterfs_logs-instance_attributes-fstype" name="fstype" value="gfs2"/>
            <nvpair id="clusterfs_logs-instance_attributes-options" name="options" value="noatime"/>
          </instance_attributes>
          <operations>
            <op id="clusterfs_logs-start-interval-0s" interval="0s" name="start" timeout="60"/>
            <op id="clusterfs_logs-stop-interval-0s" interval="0s" name="stop" timeout="60"/>
            <op id="clusterfs_logs-monitor-interval-10s" interval="10s" name="monitor" on-fail="fence"/>
          </operations>
        </primitive>
        <meta_attributes id="clusterfs_logs-clone-meta_attributes">
          <nvpair id="clusterfs_logs-clone-meta_attributes-interleave" name="interleave" value="true"/>
        </meta_attributes>
      </clone>
      <clone id="clusterfs_backups-clone">
        <primitive class="ocf" id="clusterfs_backups" provider="heartbeat" type="Filesystem">
          <instance_attributes id="clusterfs_backups-instance_attributes">
            <nvpair id="clusterfs_backups-instance_attributes-device" name="device" value="/dev/cluster_vg_backups/cluster_lv_backups"/>
            <nvpair id="clusterfs_backups-instance_attributes-directory" name="directory" value="/shared/backups"/>
            <nvpair id="clusterfs_backups-instance_attributes-fstype" name="fstype" value="gfs2"/>
            <nvpair id="clusterfs_backups-instance_attributes-options" name="options" value="noatime"/>
          </instance_attributes>
          <operations>
            <op id="clusterfs_backups-start-interval-0s" interval="0s" name="start" timeout="60"/>
            <op id="clusterfs_backups-stop-interval-0s" interval="0s" name="stop" timeout="60"/>
            <op id="clusterfs_backups-monitor-interval-10s" interval="10s" name="monitor" on-fail="fence"/>
          </operations>
        </primitive>
        <meta_attributes id="clusterfs_backups-clone-meta_attributes">
          <nvpair id="clusterfs_backups-clone-meta_attributes-interleave" name="interleave" value="true"/>
        </meta_attributes>
      </clone>
    </resources>
    <constraints>
      <rsc_location id="location-IPA01-kvm01-INFINITY" node="kvm01" rsc="IPA01" score="INFINITY"/>
      <rsc_location id="location-IPA02-kvm02-INFINITY" node="kvm02" rsc="IPA02" score="INFINITY"/>
      <rsc_location id="location-ELK01-kvm01-INFINITY" node="kvm01" rsc="ELK01" score="INFINITY"/>
      <rsc_location id="location-ELK02-kvm02-INFINITY" node="kvm02" rsc="ELK02" score="INFINITY"/>
      <rsc_location id="location-ELK03-kvm01-INFINITY" node="kvm01" rsc="ELK03" score="INFINITY"/>
      <rsc_location id="cli-prefer-PXY01" node="kvm02" role="Started" rsc="PXY01" score="INFINITY"/>
      <rsc_location id="cli-prefer-WIK01" node="kvm02" role="Started" rsc="WIK01" score="INFINITY"/>
      <rsc_order first="dlm-clone" first-action="start" id="order-dlm-clone-clvmd-clone-mandatory" then="clvmd-clone" then-action="start"/>
      <rsc_order first="clvmd-clone" first-action="stop" id="order-clvmd-clone-dlm-clone-mandatory" then="dlm-clone" then-action="stop"/>
      <rsc_colocation id="colocation-clvmd-clone-dlm-clone-INFINITY" rsc="clvmd-clone" score="INFINITY" with-rsc="dlm-clone"/>
      <rsc_order first="clvmd-clone" first-action="start" id="order-clvmd-clone-clusterfs_vms-clone-mandatory" then="clusterfs_vms-clone" then-action="start"/>
      <rsc_order first="clusterfs_vms-clone" first-action="stop" id="order-clusterfs_vms-clone-clvmd-clone-mandatory" then="clvmd-clone" then-action="stop"/>
      <rsc_colocation id="colocation-clusterfs_vms-clone-clvmd-clone-INFINITY" rsc="clusterfs_vms-clone" score="INFINITY" with-rsc="clvmd-clone"/>
      <rsc_order first="clusterfs_vms-clone" first-action="start" id="order-clusterfs_vms-clone-clusterfs_logs-clone-mandatory" then="clusterfs_logs-clone" then-action="start"/>
      <rsc_order first="clusterfs_logs-clone" first-action="stop" id="order-clusterfs_logs-clone-clusterfs_vms-clone-mandatory" then="clusterfs_vms-clone" then-action="stop"/>
      <rsc_colocation id="colocation-clusterfs_logs-clone-clvmd-clone-INFINITY" rsc="clusterfs_logs-clone" score="INFINITY" with-rsc="clvmd-clone"/>
      <rsc_order first="clusterfs_logs-clone" first-action="start" id="order-clusterfs_logs-clone-clusterfs_backups-clone-mandatory" then="clusterfs_backups-clone" then-action="start"/>
      <rsc_order first="clusterfs_backups-clone" first-action="stop" id="order-clusterfs_backups-clone-clusterfs_logs-clone-mandatory" then="clusterfs_logs-clone" then-action="stop"/>
      <rsc_colocation id="colocation-clusterfs_backups-clone-clvmd-clone-INFINITY" rsc="clusterfs_backups-clone" score="INFINITY" with-rsc="clvmd-clone"/>
    </constraints>
    <fencing-topology>
      <fencing-level devices="kvm01_ilo" id="fl-kvm01-1" index="1" target="kvm01"/>
      <fencing-level devices="kvm02_ilo" id="fl-kvm02-1" index="1" target="kvm02"/>
    </fencing-topology>
  </configuration>
</cib>
user396032
  • More information would be helpful. Could you please share your current cib? – Dok May 22 '17 at 23:15
  • My company policy does not allow me to post it. Is there anything from my cib that I can look for that would help you in helping me? – user396032 May 23 '17 at 14:41
  • That is very unfortunate. You cannot "sanitize" the configuration and post it? Perhaps just the constraints and VirtualDomain resource definitions would be sufficient. I am mostly curious if you have the allow-migrate=true meta option for the VirtualDomain resources and if your constraints are sane. – Dok May 23 '17 at 15:39
  • How do you ensure migration is complete? What do you do if migration fails? What if it fails to converge after a very long time? It's been a while, but I don't remember seeing anything like that in pacemaker's configuration. My point being - if you're doing VMs, it is better to use a VM dedicated HA system that takes care of these things out of the box. – dyasny May 23 '17 at 17:54
  • Yes, I do have allow-migrate=true. I can migrate a VM from one node to the other with no issue. The issue is as I put a node into standby. It seems that the filesystem clone is being shutdown before the VMs are able to gracefully migrate to the other node. Let me see what I can do about sanitizing the file and getting it past my supervisor. – user396032 May 23 '17 at 19:03
  • I updated the original post with my cib. There is actually another VM on there called ELK04, but I had to remove its info to fit into the post. – user396032 May 23 '17 at 20:00

1 Answer


Without being able to see your configuration, I can probably only point you to the sections of the documentation that I think might be relevant...

Take a look into the interleave option for your cloned resources in the Pacemaker documentation. It sounds like this might be what you're after.

Copied from http://clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ch10s02s02.html :

interleave: Changes the behavior of ordering constraints (between clones/masters) so that copies of the first clone can start or stop as soon as the copy on the same node of the second clone has started or stopped (rather than waiting until every instance of the second clone has started or stopped). Allowed values: false, true. The default value is false.
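With pcs, the interleave meta attribute can be applied to a clone roughly like this (a sketch using the clone names from the CIB above; verify the exact syntax against your pcs version):

```shell
# Set interleave=true on each cloned resource so a copy on one node
# only waits for the local instance of the clone it depends on,
# rather than for every instance across the whole cluster.
pcs resource meta dlm-clone interleave=true
pcs resource meta clvmd-clone interleave=true
pcs resource meta clusterfs_vms-clone interleave=true
```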

Matt Kereczman