A few times in the last week we have hit a state where kojira on koji02 times out and doesn't run newrepo tasks for buildroots.
Restarting the httpd on koji01 seems to unstick it, but this is not even a good stopgap, as we would then have to do that manually and people would hit small windows when koji was down.
So, after some investigation from koji developers, the solution, at least for now, is to increase the SSL timeout. It's currently 60s, but I have increased it to 180s in a hotfix.
http://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=ebb160dc... and http://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=f99e19b0...
This is already applied on koji02 to get us out of an outage situation (no new buildroots means no new Fedora), but with +1s I will apply it to koji01 as well and make sure the playbooks are in sync with the hosts.
kevin
Retroactive +1
On 1 April 2015 at 11:35, Kevin Fenzi kevin@scrye.com wrote:
infrastructure mailing list infrastructure@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/infrastructure
On Wed, 1 Apr 2015 11:35:32 -0600 Kevin Fenzi kevin@scrye.com wrote:
This is already applied on koji02 to get us out of an outage situation (no new buildroots means no new Fedora), but with +1s I will apply it to koji01 as well and make sure the playbooks are in sync with the hosts.
Seems to be a relatively low risk change to me
+1
Applied.
Note that we may roll out new packages with this change at some point so we don't have to worry about the hotfix messing things up.
kevin