
I have an Ansible role that I call to return the following information about a network switch:

    "networkSwitch": [
        {
            "switchName": "some_switch_name_01",
            "uplinks": [
                "physicalNic-nic0",
                "physicalNic-nic3"
            ],
            "uplink_count": 2
        }
    ]

I need to call the role repeatedly until one of two conditions is met:

  1. the switchName value matches an expected name
  2. the uplink_count is >= 1

Preferably, both of the above would be true, and the loop would then exit successfully.

I've discovered that it's not possible to use until/retries/delay with either include_role or import_role.

I found "Looping or repeating a group of tasks until success" and attempted to follow its instructions, without success.

I've created a file called get_switch_detail.yml. It contains:

---
- name: repeat until success
  block:
    - name: increment attempts counter
      set_fact:
        attempt_number: "{{ attempt_number | d(0) | int + 1 }}"

    - name: Get switch details 
      debug:
        msg: "This is a dummy task, it doesn't actually get the switch details"

    - name: Print the switch facts
      debug: var=item
      with_items: "{{switch_facts.results}}"
      when: item.ansible_facts is defined

    - name: set fact to collect specific Switch attributes
      set_fact:
        networkSwitch: "{{ networkSwitch | d([]) + [{ 'switchName': item.ansible_facts.config.network.proxySwitch[0].dvsName, 
                                                  'uplinks': item.ansible_facts.config.network.proxySwitch[0].pnic, 
                                                  'uplink_count': item.ansible_facts.config.network.proxySwitch[0].pnic|length }] }}"
      with_list: "{{lookup('list', switch_facts.results) }}"
      when: item.ansible_facts is defined

    - name: Print the proxySwitch attributes
      debug: 
        msg:
          - "proxySwitch attributes are: {{proxySwitch}}"
          - "proxySwitch name is: {{proxySwitch[0].dvsName}}"
          - "Number of proxySwitch uplinks is: {{proxySwitch[0].pnic_count}}"

    - name: Print
      debug: var=proxySwitch

    - name: Task to check for success
      debug:
        msg: success
      when: 
        - proxySwitch[0].switchName == target_switch_name
        - proxySwitch[0].uplink_count >= 2
  
  rescue:
    - name: "Fail if we reached the max of {{ max_attempts | d(3) }} attempts"
      # Default will be 3 attempts if max_attempts is not passed as a parameter
      ansible.builtin.fail:
        msg: Maximum number of attempts reached
      when: attempt_number | int == max_attempts | int | d(3)

    - ansible.builtin.debug:
        msg: "group of tasks failed on attempt {{ attempt_number }}. Retrying"

    - name: add delay if needed
      # no delay if retry_delay is not passed as parameter
      ansible.builtin.wait_for:
        timeout: "{{ retry_delay | int | d(omit) }}"
      when: retry_delay is defined

    # include ourselves to retry.
    - ansible.builtin.include_tasks: ../get_switch_detail.yml

I call the above using:

    - name: Retrying
      include_tasks: ../get_switch_detail.yml
      vars:
        max_attempts: 4
        retry_delay: 5

but it never repeats.

piercjs
    At first glance, there are no tasks that actually fail in your block and would trigger the rescue. I suspect your last task in block should use `failed_when`. `when` just tells when to run/skip a task and does not trigger failure in itself. – Zeitounator Apr 26 '23 at 16:18
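A minimal sketch of what this comment suggests: replacing the final `debug` task with `ansible.builtin.assert` makes the block fail whenever the conditions are not met, which is what hands control to `rescue`. It assumes the collected list is the `networkSwitch` fact built by the `set_fact` task (the question's check refers to `proxySwitch`, which is never set) and that `target_switch_name` is defined. Because that `set_fact` appends on every attempt, the sketch checks the most recent entry.

```yaml
# Hypothetical replacement for the "Task to check for success" task.
# assert fails the task when any listed condition is false,
# and a failed task inside block is what triggers rescue.
- name: Task to check for success
  ansible.builtin.assert:
    that:
      # networkSwitch grows by one entry per attempt, so look at the last one
      - (networkSwitch | last).switchName == target_switch_name
      - (networkSwitch | last).uplink_count | int >= 1
    fail_msg: "Switch not ready on attempt {{ attempt_number }}"
    success_msg: success
```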

1 Answer


Figured it out. This is probably not the correct way of doing things, but in this scenario it gives the desired outcome.

In short, I'm calling the role from within itself when a condition is met. This can potentially lead to an infinite loop, but it was the best I could come up with.

Main play

- hosts: "{{targets}}"
  name: demo looping role
  gather_facts: false
  connection: local
  vars_files:
    - ../vars/some_vars.yml

  tasks:
    - name: First task, always performed
      debug:
        msg: "Current server being actioned: {{ inventory_hostname }}"
      when:
        - item.name is defined and item.name == inventory_hostname
      with_items: "{{ my_stuff }}"

    - name: Get switch Uplink details - recursive loop
      include_role:
        name: get_switch_uplinks
      when:
        - inventory_hostname == outer_item.name and outer_item.name is defined
      with_items: "{{my_stuff }}"
      loop_control:
        loop_var: outer_item

    - name: third task, only performed when condition in second task is true
      include_role:
        name: third_role
      when:
        - item.name is defined and item.name == inventory_hostname
      with_items: "{{ my_stuff }}"

The last task within the get_switch_uplinks role calls the role again when the conditions are not yet met:

---
- name: Get switch facts 
  private_module_name_to_do_something:
    switchname: "{{ item.switchname }}"
    username: "{{ username }}"
    password: "{{ password }}"
  no_log: false
  delegate_to: localhost
  register: switch_facts
  with_items: "{{ my_stuff }}"
  when:
    - inventory_hostname == item.name

- name: Set switch var to be empty
  set_fact: 
    switch: []

- name: set fact to collect specific switch attributes
  set_fact:
    switch: "{{ switch | d([]) + [{ 'switchName': item.ansible_facts.config.network.switch[0].switchName, 
                                              'uplinks': item.ansible_facts.config.network.switch[0].uplinks, 
                                             'uplink_count': item.ansible_facts.config.network.switch[0].uplinks|length }] }}"
  with_list: "{{lookup('list', switch_facts.results) }}"
  when: item.ansible_facts is defined

- name: Print specific switch attributes
  debug: var=switch

- name: role imports itself when conditions not met 
  include_role:
    name: get_switch_uplinks
  when: (inner_item.switchName != target_switch_name) or (inner_item.uplink_count <= 0)
  with_list: "{{ lookup('list', switch) }}"
  loop_control:
    loop_var: inner_item

Again, if the specified conditions are never met, this causes an infinite loop. It is definitely not the best solution.
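One way to bound the recursion is to count attempts and stop re-including the role once a limit is reached. The sketch below is a variant of the last task of the role; `switch_attempt` and `switch_max_attempts` are hypothetical variable names that are not part of the original role, and the default of 10 is arbitrary.

```yaml
# Hypothetical guard around the self-include in get_switch_uplinks.
# switch_attempt and switch_max_attempts are illustrative names only.
- name: Count this attempt
  ansible.builtin.set_fact:
    switch_attempt: "{{ switch_attempt | d(0) | int + 1 }}"

- name: role imports itself when conditions not met and attempts remain
  ansible.builtin.include_role:
    name: get_switch_uplinks
  when:
    # list items are ANDed: recurse only while under the attempt limit
    - switch_attempt | int < switch_max_attempts | d(10) | int
    - (inner_item.switchName != target_switch_name) or (inner_item.uplink_count <= 0)
  with_list: "{{ lookup('list', switch) }}"
  loop_control:
    loop_var: inner_item
```

When the limit is hit, the role simply stops calling itself, so a later task would still need to verify that the switch is actually ready.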

In my scenario, though, the conditions will eventually be met; they are just met earlier on some devices than on others, so a static timer at the beginning of the third play task wasn't always successful.
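Since `until`/`retries`/`delay` are supported on ordinary module tasks, another option may be to retry the fact-gathering task itself rather than the whole role. This is only a sketch: it assumes the private module returns everything needed in one call, under the same `ansible_facts` path used above, and the `retries` and `delay` values are placeholders.

```yaml
# Sketch: retry the module call per item until the switch looks ready.
# Inside until, switch_facts refers to the result of the current item.
- name: Get switch facts, retrying until the uplinks are present
  private_module_name_to_do_something:
    switchname: "{{ item.switchname }}"
    username: "{{ username }}"
    password: "{{ password }}"
  delegate_to: localhost
  register: switch_facts
  until:
    - switch_facts.ansible_facts is defined
    - switch_facts.ansible_facts.config.network.switch[0].switchName == target_switch_name
    - switch_facts.ansible_facts.config.network.switch[0].uplinks | length >= 1
  retries: 10   # placeholder value
  delay: 5      # seconds between attempts, placeholder value
  with_items: "{{ my_stuff }}"
  when:
    - inventory_hostname == item.name
```

If the conditions are never satisfied, the task fails after the last retry instead of looping forever.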

piercjs