kushal reported a new issue against the project: `atomic-wg` that you are following: `` During our Atomic tests, after disabling the chronyd service, we reboot the VM. After that the ssh service is not coming back in time. This VM has one vCPU.
Sometimes we saw this behaviour on Vagrant libvirt (which boots with 2 vCPUs), and at least once with Vagrant VirtualBox before.
One such example of failure: https://apps.fedoraproject.org/autocloud/jobs/2188/output#30
https://lists.fedoraproject.org/archives/list/cloud@lists.fedoraproject.org/... is the thread in the mailing list on the same topic. ``
To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/232
dustymabe added a new comment to an issue you are following: `` hey @kushal, is there any way at all to reproduce this issue outside of the test suite? I'm thinking this probably falls into one of a few different categories:
- This is easily reproducible with a specific VM libvirt configuration (i.e. the hardware presented to the VM) and thus should be easily reproducible anywhere.
- This doesn't reproduce any other way than by running tunir.
- This doesn't reproduce anywhere but the environment that is running the tunir tests.
Can you try to chase down to see if one of those statements is true? ``
roshi added a new comment to an issue you are following: `` I can say that with my ansible tests (run on an instance locally with testcloud) this issue does not happen. ``
trishnag added a new comment to an issue you are following: `` It worked fine for me. Tested with 1 vcpu and 2 vcpu(s). Unable to reproduce the issue. ``
mattdm added a new comment to an issue you are following: `` Kushal, were you able to get the system logs from a failed image? ``
kushal added a new comment to an issue you are following: ``
> Kushal, were you able to get the system logs from a failed image?
Nope, but I managed to get the tests working on the same boxes if we wait much longer for ssh to come back.
So I pushed tunir-0.17.1 to production after testing on my servers. It has a POLL directive, and we are polling for 300 seconds for the ssh service to come back. ``
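For readers following along, here is a minimal sketch of what polling for ssh after a reboot could look like. It is not tunir's actual POLL implementation, which is not shown in this thread; the function name `wait_for_port` and its parameters are hypothetical, and the 300-second default only mirrors the number mentioned above.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=300.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Hypothetical helper for illustration only; not part of tunir.
    Returns True if the port accepted a connection, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows that something is listening;
            # it does not prove that sshd can complete a login.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, unreachable, or timed out: retry below.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Something like `wait_for_port("192.0.2.10")` (a placeholder address) would then block for up to five minutes before the test run gives up on the VM.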
mattdm added a new comment to an issue you are following: `` I'd like to look at the logs still. :) ``
ilmostro added a new comment to an issue you are following: ``
> Finally managed to isolate the issue. If we boot the image with only one CPU, the error comes up. If we boot with 2 or more CPU(s), no issues at all. Now the question is if we should make local testing on Autocloud with 2 CPU(s) or get this issue fixed somehow? Kushal
Delayed success with 1 vCPU vs. "success" with 2 vCPUs supports the theory regarding an "entropy problem" raised in the [mailing-list](https://lists.fedoraproject.org/archives/list/cloud@lists.fedoraproject.org/...) thread. ``
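One way to gather evidence for or against the entropy theory would be to record the kernel's entropy estimate inside the guest right after boot. The sketch below is an assumption-laden illustration, not something run in this thread: the helper name `read_entropy_avail` is hypothetical, and it relies on the Linux procfs file `/proc/sys/kernel/random/entropy_avail`, whose meaning differs between kernel versions.

```python
from pathlib import Path

# Standard Linux procfs location of the kernel's entropy estimate (in bits).
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def read_entropy_avail(path=ENTROPY_PATH):
    """Return the kernel's current entropy estimate as an integer.

    Hypothetical helper for illustration; the path is a parameter so the
    function can be pointed at a saved copy of the file.
    """
    return int(Path(path).read_text().strip())
```

Logging this value on both the 1 vCPU and 2 vCPU boots would show whether the slow case really starts with a much lower estimate.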
dustymabe added a new comment to an issue you are following: `` haven't seen this lately - let's close it and re-open if the issue pops back up ``
The status of the issue: `ssh issue on libvirt based images` of project: `atomic-wg` has been updated to: Closed by dustymabe.
cloud@lists.stg.fedoraproject.org