I see this consistently with the playbook below.
- hosts: all
  gather_facts: no
  become: yes
  tasks:
    - name: restart winbind
      service:
        name: winbind
        state: restarted
    - name: reset ssh connection since winbind is dumb and fails the job even though it succeeds
      meta: reset_connection
My ansible user logs in, restarts the service, and is then disconnected by winbind, which makes ansible think the task failed.
Feb 13 17:04:57 server1 sshd[30156]: pam_namespace(sshd:session): user unknown 'ansible-user'
Feb 13 17:04:57 server1 sshd[30156]: pam_unix(sshd:session): session closed for user ansible-user
Feb 13 17:04:57 server1 sshd[30156]: fatal: login_init_entry: Cannot find user "ansible-user"
Feb 13 17:04:57 server1 sshd[30163]: fatal: mm_request_send: write: Broken pipe
Feb 13 17:04:57 server1 sshd[30163]: fatal: mm_request_send: write: Broken pipe
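Things I have tried:
- Adding a wait to the restart winbind task. That didn't help; it just disconnects after the configured wait time.
- Adding meta: reset_connection, but the connection drops before that task runs.
In addition, when this happens I stay logged in to the server with my normal ID, so only ansible-user is disconnected. Both my ID and the ansible-user ID are AD accounts.
I have another playbook for sssd, and I don't see this behavior there. It restarts the service, keeps the connection, and shows changed in the run output.
Thanks!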
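Answer 1
Found a workaround: ignore the fatal unreachable error and pause the playbook run. I'm not sure how well this scales, so if anyone has a better approach, I'm all ears. Fortunately, winbind is on its way out.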
- hosts: all
  gather_facts: no
  ignore_unreachable: yes
  become: yes
  tasks:
    - name: restart winbind
      service:
        name: winbind
        state: restarted
    - name: pause the playbook run to allow winbind to reconnect to AD
      pause:
        seconds: 8
    - name: confirm winbind is running
      service:
        name: winbind
        state: started
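
Another approach that might be worth trying, as an untested sketch rather than a confirmed fix: start the restart as a fire-and-forget async task, so the SSH session has already returned by the time winbind goes down, and then use wait_for_connection instead of a fixed pause. This assumes the host uses systemd (the systemctl call is my assumption, not something from the original playbook), and the sleep, delay, and timeout values are arbitrary placeholders.

```yaml
- hosts: all
  gather_facts: no
  become: yes
  tasks:
    # Assumption: systemd host. The short sleep gives Ansible time to
    # detach from the task before winbind actually restarts.
    - name: restart winbind in the background
      shell: sleep 2 && systemctl restart winbind
      async: 30   # let the background job run for up to 30 seconds
      poll: 0     # do not wait for the result (fire and forget)

    # Retry the connection until ansible-user can log in again,
    # instead of pausing for a fixed number of seconds.
    - name: wait for the host to accept connections again
      wait_for_connection:
        delay: 5      # placeholder: seconds to wait before the first attempt
        timeout: 60   # placeholder: give up after this many seconds

    - name: confirm winbind is running
      service:
        name: winbind
        state: started
```

Because the restart task no longer reports its own result, the final service task is what tells you whether winbind actually came back. If the connection still drops before the async task is handed off, ignore_unreachable from the workaround above could be kept as a fallback.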