It happened again, but this time I have more traces and hints.
This morning (9:33 UTC) I got this Nagios alert:

    WARN: datanommer has not seen a copr message in 6 hours, 10 minutes, 39 seconds

which means that something happened sometime between 3:30 UTC and 4:30 UTC.
I logged in to copr-be and, to my surprise, the builder playbook was failing even though nothing had been changed overnight:

    ansible-playbook -vvvv -c ssh /home/copr/provision/builderpb.yml
    ERROR: debug is not a legal parameter in an Ansible task or handler
I then found that:

    rpm -V ansible
    ...
    missing /usr/share/ansible/utilities
    missing /usr/share/ansible/utilities/accelerate
    missing /usr/share/ansible/utilities/debug
    missing /usr/share/ansible/utilities/fail
    missing /usr/share/ansible/utilities/include_vars
    missing /usr/share/ansible/utilities/pause
    missing /usr/share/ansible/utilities/set_fact
    missing /usr/share/ansible/utilities/wait_for
I.e. the whole content of /usr/share/ansible/utilities is missing. I quickly reinstalled the ansible package and everything started working again.
Now I have to find the cause, otherwise I expect it will happen again tonight.
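To narrow down when the files disappear, a small check could be run periodically (for example from cron) and log the moment rpm starts reporting missing files. The following is only a rough sketch: list_missing is a made-up helper name, it only parses the "missing" lines of rpm -V output as shown above, and it assumes paths without spaces.

```shell
#!/bin/sh
# Sketch only: list_missing is a hypothetical helper, not part of rpm or ansible.
# It reads `rpm -V <package>` output on stdin and prints the paths that rpm
# reports as missing, one per line. Assumes paths contain no whitespace.
list_missing() {
    awk '$1 == "missing" { print $NF }'
}

# Possible usage (commented out; needs rpm and the installed package):
#   rpm -V ansible | list_missing | while read -r path; do
#       logger -t ansible-check "missing: $path"
#   done
```

With something like this logging to syslog, the first "missing" entry could be lined up against the other syslog messages from the same minute.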
I checked syslog, and the only relevant entries are:

1)
    Feb 28 03:46:22 dhcp-client03 systemd[1]: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 24347 (find)
    Feb 28 03:46:22 dhcp-client03 systemd[1]: Mounting Arbitrary Executable File Formats File System...
    Feb 28 03:46:22 dhcp-client03 systemd[1]: Mounted Arbitrary Executable File Formats File System.
2)
    Feb 28 04:04:05 dhcp-client03 systemd-logind[291]: New session 24 of user root.
    Feb 28 04:04:05 dhcp-client03 ansible-yum: Invoked with CHECKMODE=True name=cloud-utils list=None disable_gpg_check=False conf_file=None state=present disablerepo=None enablerepo=None
    Feb 28 04:04:05 dhcp-client03 systemd-logind[291]: Removed session 24.
    Feb 28 04:04:05 dhcp-client03 systemd-logind[291]: New session 25 of user root.
    Feb 28 04:04:05 dhcp-client03 ansible-command: Invoked with executable=None shell=False args=growpart /dev/vda 2 removes=None creates=None chdir=None
    Feb 28 04:04:06 dhcp-client03 systemd-logind[291]: Removed session 25.
    Feb 28 04:04:06 dhcp-client03 systemd-logind[291]: New session 26 of user root.
    Feb 28 04:04:06 dhcp-client03 ansible-setup: Invoked with CHECKMODE=True filter=* fact_path=/etc/ansible/facts.d
    Feb 28 04:04:06 dhcp-client03 systemd-logind[291]: Removed session 26.
    Feb 28 04:04:07 dhcp-client03 systemd-logind[291]: New session 27 of user root.
    Feb 28 04:04:07 dhcp-client03 ansible-yum: Invoked with CHECKMODE=True name=fedmsg,libsemanage-python,python-psutil list=None disable_gpg_check=False conf_file=None state=installed disablerepo=None pkg=fedmsg,libsemanage-python,python-psutil enablerepo=None
    Feb 28 04:04:42 dhcp-client03 systemd-logind[291]: Removed session 27.
I am not sure about the first one.
The second one is some ansible playbook run (could it be nirik's check for differences?). But I'm really clueless how it could remove /usr/share/ansible/utilities/*. Does anybody have an idea?