How do I skip an async task that is already running?


(Source: @Kerrick Staley)

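I want to create an Ansible playbook containing an async task followed by an async_status task, so that the following works:

  1. I run the playbook on my laptop. It starts the async task in the background and begins polling it with the async_status task.

  2. I restart my laptop. The async task keeps running on the server.

  3. I run the playbook again. It recognizes that the async task is already running on the server and goes straight to the async_status task, polling the task that was started in step 1.

Does Ansible support this? How can I write a playbook that does this?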

Answer 1

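In the first play, start the processes asynchronously, create a job ID file var/run/<host>/<jid> for each one, and write a log to var/log/async.log. In the second play, wait for the processes to finish, write the log, and remove the job ID files.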
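The core idea can be sketched for a single task on a single host. This is a simplified variant, not the project shown in this answer: it assumes the job ID is stored on the managed host rather than on the controller, and the path /tmp/long_task.jid, the command sleep 300, and the timeout and retry values are hypothetical placeholders. It also assumes both runs connect as the same remote user, because async_status looks up the job in that user's async directory.

```yaml
- hosts: all

  vars:
    # Hypothetical location of the recorded job ID on the managed host
    jid_file: /tmp/long_task.jid

  tasks:

    - name: Check whether an earlier run recorded a job ID
      ansible.builtin.stat:
        path: "{{ jid_file }}"
      register: jid_stat

    - name: Start the long-running command only if no job ID is recorded
      ansible.builtin.command: sleep 300   # placeholder for the real command
      async: 600
      poll: 0
      register: job
      when: not jid_stat.stat.exists

    - name: Record the new job ID
      ansible.builtin.copy:
        dest: "{{ jid_file }}"
        content: "{{ job.ansible_job_id }}"
      when: not jid_stat.stat.exists

    - name: Read the recorded job ID (new or from an earlier run)
      ansible.builtin.slurp:
        src: "{{ jid_file }}"
      register: jid_raw

    - name: Poll the job until it finishes
      ansible.builtin.async_status:
        jid: "{{ jid_raw.content | b64decode | trim }}"
      register: poll_result
      until: poll_result.finished
      retries: 120
      delay: 5

    - name: Forget the job ID once the job is done
      ansible.builtin.file:
        path: "{{ jid_file }}"
        state: absent
```

On the second run the stat task finds the file, the start and record tasks are skipped, and polling resumes with the job ID from the first run.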

Complete test project example

shell> tree .
.
├── ansible.cfg
├── group_vars
│   └── all
│       └── async_common.yml
├── hosts
├── pb-start.yml
├── pb-status.yml
├── proc01.yml
└── var
    ├── log
    └── run

7 directories, 7 files
shell> cat group_vars/all/async_common.yml 
var_run: "{{ playbook_dir }}/var/run"
var_log: "{{ playbook_dir }}/var/log"
async_log: async.log
shell> cat hosts
test_11
test_13

Declare the processes. Two processes will be started on each host

shell> cat proc01.yml 
cmds:
  - cmd: sleep 30
    async: 45
  - cmd: sleep 45
    async: 60

Start the processes, create the job IDs, and write the log

shell> cat pb-start.yml
- hosts: all

  vars:

    results_dict: "{{ dict(ansible_play_hosts|
                           zip(ansible_play_hosts|
                               map('extract', hostvars, 'async_results'))) }}"

  tasks:

    - name: Start processes
      block:

        - command: "{{ item.cmd }}"
          async: "{{ item.async }}"
          poll: 0
          loop: "{{ cmds }}"
          register: async_results
        - debug:
            var: async_results
          when: debug|d(false)|bool

    - name: Record processes and write log
      block:

        - file:
            state: directory
            path: "{{ item }}"
          loop:
            - "{{ var_run }}"
            - "{{ var_log }}"
        - file:
            state: directory
            path: "{{ var_run }}/{{ item }}"
          loop: "{{ ansible_play_hosts }}"

        - copy:
            dest: "{{ var_run }}/{{ item.0.key }}/{{ item.1.ansible_job_id }}"
            content: "{{ item.1.results_file }}"
          loop: "{{ results_dict|dict2items|subelements('value.results') }}"
          loop_control:
            label: "{{ item.0.key }} {{ item.1.ansible_job_id }}"

        - lineinfile:
            create: true
            dest: "{{ var_log }}/{{ async_log }}"
            line: >-
              {{ '%Y-%m-%d %H:%M:%S'|strftime }}
              [start]  {{ item.0.key }}
              s:{{ item.1.started }}
              f:{{ item.1.finished }}
              {{ item.1.ansible_job_id }}
          loop: "{{ results_dict|dict2items|subelements('value.results') }}"
          loop_control:
            label: "{{ item.0.key }} {{ item.1.ansible_job_id }}"

      run_once: true
      delegate_to: localhost

gives

shell> ansible-playbook -e @proc01.yml pb-start.yml

PLAY [all] ***********************************************************************************

TASK [command] *******************************************************************************
changed: [test_11] => (item={'cmd': 'sleep 30', 'async': 45})
changed: [test_13] => (item={'cmd': 'sleep 30', 'async': 45})
changed: [test_11] => (item={'cmd': 'sleep 45', 'async': 60})
changed: [test_13] => (item={'cmd': 'sleep 45', 'async': 60})

TASK [debug] *********************************************************************************
skipping: [test_11]
skipping: [test_13]

TASK [file] **********************************************************************************
ok: [test_11 -> localhost] => (item=/export/scratch/tmp7/test-265/var/run)
ok: [test_11 -> localhost] => (item=/export/scratch/tmp7/test-265/var/log)

TASK [file] **********************************************************************************
ok: [test_11 -> localhost] => (item=test_11)
ok: [test_11 -> localhost] => (item=test_13)

TASK [copy] **********************************************************************************
changed: [test_11 -> localhost] => (item=test_11 571415057700.84160)
changed: [test_11 -> localhost] => (item=test_11 924759903126.84193)
changed: [test_11 -> localhost] => (item=test_13 551498199552.84159)
changed: [test_11 -> localhost] => (item=test_13 976946831378.84194)

TASK [lineinfile] ****************************************************************************
changed: [test_11 -> localhost] => (item=test_11 571415057700.84160)
changed: [test_11 -> localhost] => (item=test_11 924759903126.84193)
changed: [test_11 -> localhost] => (item=test_13 551498199552.84159)
changed: [test_11 -> localhost] => (item=test_13 976946831378.84194)

PLAY RECAP ***********************************************************************************
test_11: ok=5    changed=3    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0   
test_13: ok=1    changed=1    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
shell> cat var/log/async.log 
2023-03-21 05:43:36 [start]  test_11 s:1 f:0 571415057700.84160
2023-03-21 05:43:36 [start]  test_11 s:1 f:0 924759903126.84193
2023-03-21 05:43:36 [start]  test_13 s:1 f:0 551498199552.84159
2023-03-21 05:43:37 [start]  test_13 s:1 f:0 976946831378.84194
shell> tree var/run/
var/run/
├── test_11
│   ├── 571415057700.84160
│   └── 924759903126.84193
└── test_13
    ├── 551498199552.84159
    └── 976946831378.84194

2 directories, 4 files

Wait for the processes to finish, remove the job IDs, and write the log

shell> cat pb-status.yml
- hosts: localhost

  vars:

    procs_cmd: "cd {{ var_run }}; ls -1 *"
    procs_dict: "{{ dict(lookup('ansible.builtin.pipe', procs_cmd)|
                         community.general.jc('ls')|
                         groupby('parent')) }}"

# Optionally, select and/or deny job IDs
#                        selectattr('filename', 'in' , jid_allow|d([]))|
#                        rejectattr('filename', 'in' , jid_deny|d([]))|

  tasks:
    - debug:
        var: procs_dict

    - name: Wait for processes to finish and write log
      block:

        - debug:
            msg: |
              host: {{ item.0.key }} jid: {{ item.1.filename }}
          loop: "{{ procs_dict|dict2items|subelements('value') }}"
          loop_control:
            label: "{{ item.0.key }}"
          when: debug|d(false)|bool

        - async_status:
            jid: "{{ item.1.filename }}"
          loop: "{{ procs_dict|dict2items|subelements('value') }}"
          loop_control:
            label: "{{ item.0.key }}"
          delegate_to: "{{ item.0.key }}"
          register: async_poll
          until: async_poll.finished
          retries: 999
        - debug:
            var: async_poll
          when: debug|d(false)|bool

        - lineinfile:
            create: true
            dest: "{{ var_log }}/{{ async_log }}"
            line: >-
              {{ '%Y-%m-%d %H:%M:%S'|strftime }}
              [status] {{ item.item.0.key }}
              s:{{ item.started }}
              f:{{ item.finished }}
              {{ item.ansible_job_id }}
          loop: "{{ async_poll.results }}"
          loop_control:
            label: "{{ item.item.1.parent }} {{ item.item.1.filename }}"

    - name: Remove finished processes and write log
      block:

        - file:
            state: absent
            path: "{{ var_run }}/{{ item.item.0.key }}/{{ item.ansible_job_id }}"
          loop: "{{ async_poll.results|selectattr('finished', 'eq', 1) }}"
          loop_control:
            label: "{{ item.item.1.parent }} {{ item.item.1.filename }}"

        - lineinfile:
            create: true
            dest: "{{ var_log }}/{{ async_log }}"
            line: >-
              {{ '%Y-%m-%d %H:%M:%S'|strftime }}
              [remove] {{ item.item.0.key }}
              s:{{ item.started }}
              f:{{ item.finished }}
              {{ item.ansible_job_id }}
          loop: "{{ async_poll.results|selectattr('finished', 'eq', 1) }}"
          loop_control:
            label: "{{ item.item.1.parent }} {{ item.item.1.filename }}"

      when: remove_finished|d(false)|bool

gives

shell> ansible-playbook pb-status.yml -e remove_finished=true

PLAY [localhost] *****************************************************************************

TASK [debug] *********************************************************************************
skipping: [localhost] => (item=test_11) 
skipping: [localhost] => (item=test_11) 
skipping: [localhost] => (item=test_13) 
skipping: [localhost] => (item=test_13) 
skipping: [localhost]

TASK [async_status] **************************************************************************
changed: [localhost -> test_11] => (item=test_11)
changed: [localhost -> test_11] => (item=test_11)
changed: [localhost -> test_13] => (item=test_13)
changed: [localhost -> test_13] => (item=test_13)

TASK [debug] *********************************************************************************
skipping: [localhost]

TASK [lineinfile] ****************************************************************************
changed: [localhost] => (item=test_11 571415057700.84160)
changed: [localhost] => (item=test_11 924759903126.84193)
changed: [localhost] => (item=test_13 551498199552.84159)
changed: [localhost] => (item=test_13 976946831378.84194)

TASK [file] **********************************************************************************
changed: [localhost] => (item=test_11 571415057700.84160)
changed: [localhost] => (item=test_11 924759903126.84193)
changed: [localhost] => (item=test_13 551498199552.84159)
changed: [localhost] => (item=test_13 976946831378.84194)

TASK [lineinfile] ****************************************************************************
changed: [localhost] => (item=test_11 571415057700.84160)
changed: [localhost] => (item=test_11 924759903126.84193)
changed: [localhost] => (item=test_13 551498199552.84159)
changed: [localhost] => (item=test_13 976946831378.84194)

PLAY RECAP ***********************************************************************************
localhost: ok=4    changed=4    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0
shell> cat var/log/async.log 
2023-03-21 05:43:36 [start]  test_11 s:1 f:0 571415057700.84160
2023-03-21 05:43:36 [start]  test_11 s:1 f:0 924759903126.84193
2023-03-21 05:43:36 [start]  test_13 s:1 f:0 551498199552.84159
2023-03-21 05:43:37 [start]  test_13 s:1 f:0 976946831378.84194
2023-03-21 05:46:58 [status] test_11 s:1 f:1 571415057700.84160
2023-03-21 05:46:58 [status] test_11 s:1 f:1 924759903126.84193
2023-03-21 05:46:58 [status] test_13 s:1 f:1 551498199552.84159
2023-03-21 05:46:58 [status] test_13 s:1 f:1 976946831378.84194
2023-03-21 05:47:00 [remove] test_11 s:1 f:1 571415057700.84160
2023-03-21 05:47:00 [remove] test_11 s:1 f:1 924759903126.84193
2023-03-21 05:47:00 [remove] test_13 s:1 f:1 551498199552.84159
2023-03-21 05:47:00 [remove] test_13 s:1 f:1 976946831378.84194
shell> tree var/run/
var/run/
├── test_11
└── test_13

2 directories, 0 files

You can selectively deny IDs. For example, create a list

shell> cat jid_deny.yml 
jid_deny:
  - '424790592066.84404'
  - '828365727638.84439'

and update the declaration of the dictionary

    procs_dict: "{{ dict(lookup('ansible.builtin.pipe', procs_cmd)|
                         community.general.jc('ls')|
                         rejectattr('filename', 'in' , jid_deny|d([]))|
                         groupby('parent')) }}"

Then pass the list on the command line

shell> ansible-playbook pb-status.yml -e remove_finished=true -e @jid_deny.yml

...

In the same way, you can selectively allow IDs. Create the list jid_allow and update the declaration

    procs_dict: "{{ dict(lookup('ansible.builtin.pipe', procs_cmd)|
                         community.general.jc('ls')|
                         selectattr('filename', 'in' , jid_allow|d([]))|
                         rejectattr('filename', 'in' , jid_deny|d([]))|
                         groupby('parent')) }}"
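As an illustration, an allow list could look like the file below. The file name jid_allow.yml is an assumption that mirrors jid_deny.yml, and the job ID is taken from the test run above; substitute the IDs found in your own var/run tree.

```yaml
# jid_allow.yml (assumed file name, mirroring jid_deny.yml)
# Only the job IDs listed here are polled by pb-status.yml
jid_allow:
  - '571415057700.84160'
```

It would be passed the same way as the deny list, for example with -e @jid_allow.yml on the ansible-playbook command line. One caveat with the declaration as written: because selectattr tests membership against jid_allow|d([]), running the playbook without defining jid_allow selects no job IDs at all, so the allow filter should only be part of the declaration when the list is actually supplied.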

Answer 2

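It is hard to say without some code and more information about your environment, but assuming the remote hosts run a modern Linux distribution, I would consider letting systemd do most of the work for me.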

Take a look at oneshot services:

https://www.redhat.com/sysadmin/systemd-oneshot-service

Ansible can then just manage the service and check its status asynchronously.
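A minimal sketch of how that might look, under several assumptions: a unit named longjob.service (hypothetical) with Type=oneshot is already installed on the remote hosts, the play may use become, and the retry values are placeholders. While a oneshot unit is running, systemctl is-active reports activating, so the start task can be skipped when the job is already in progress.

```yaml
- hosts: all
  become: true

  tasks:

    - name: Query the current state of the (hypothetical) oneshot unit
      ansible.builtin.command: systemctl is-active longjob.service
      register: unit_state
      changed_when: false
      failed_when: false   # is-active exits non-zero unless the unit is active

    - name: Start the unit without waiting, unless it is already running
      ansible.builtin.systemd:
        name: longjob.service
        state: started
        no_block: true     # return immediately instead of waiting for the job
      when: unit_state.stdout not in ['activating', 'active']

    - name: Poll until the unit has left the activating state
      ansible.builtin.command: systemctl is-active longjob.service
      register: unit_poll
      changed_when: false
      failed_when: unit_poll.stdout == 'failed'
      until: unit_poll.stdout != 'activating'
      retries: 60
      delay: 10
```

With this layout the job's lifetime is tracked by systemd on the server, so nothing has to be recorded on the laptop between runs.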
