On 27 June 2017 at 09:47, Miroslav Suchý msuchy@redhat.com wrote:
On 26.6.2017 at 18:50, Kevin Fenzi wrote:
Greetings.
I've seen various retrace/faf issues of late, so I thought I would collect them into an email and see if you all could take a look and solve them. :)
Thank you for bringing it up.
- retrace02.qa.fedoraproject.org has a 100% full disk.
retrace02 is used only for staging/development, so it is not a big issue. But I am working on it right now. It should be resolved by EOB. .... Resolved now. :)
Should we rename the system to retrace01.stg.qa.fedoraproject.org? That way we can treat problems on it as lower priority on our side.
Second, who should we put on monitoring for it and the other servers? I am updating the Nagios configuration so that more people, across different classes of users, are made aware of alerts.
- retrace01.qa.fedoraproject.org is almost constantly alerting on swap being full. Not sure what to do about this, but perhaps we could add more swap or somehow limit it to use only memory for normal jobs?
A few months ago I set PostgreSQL to use more aggressive caching, so that is the main culprit for consuming so much memory. I can easily lower it by a few percent. But... I see right now that there is 16 GB of swap, of which 8 GB is free. And the total available memory is 16 GB: 8 GB of free swap plus 8 GB of kernel buffers/cache. So when do you see those errors, and what are the exact numbers in those alerts?
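To make it easier to compare numbers, here is a rough Python sketch of how the swap and cache figures above could be computed from /proc/meminfo-style text. The function names and the SAMPLE values are hypothetical and chosen only to mirror the figures quoted above (16 GB swap, 8 GB free); this is not what the Nagios check itself does.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value (KiB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip blank or malformed lines; keep only numeric values.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_used_percent(fields):
    """Percentage of swap currently in use (0.0 when no swap is configured)."""
    total = fields.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - fields.get("SwapFree", 0)) / total


def reclaimable_kib(fields):
    """Free memory plus kernel buffers and page cache, in KiB."""
    return (fields.get("MemFree", 0)
            + fields.get("Buffers", 0)
            + fields.get("Cached", 0))


# Hypothetical sample, NOT real data from retrace01.
SAMPLE = """\
MemTotal:       16777216 kB
MemFree:          262144 kB
Buffers:         1048576 kB
Cached:          7340032 kB
SwapTotal:      16777216 kB
SwapFree:        8388608 kB
"""

fields = parse_meminfo(SAMPLE)
print("swap used: %.1f%%" % swap_used_percent(fields))
print("free + buffers/cache: %d KiB" % reclaimable_kib(fields))
```

On a live host the same functions could be fed the contents of /proc/meminfo instead of SAMPLE, which would give the exact numbers to compare against the alert thresholds.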
I will investigate the remaining issues tomorrow.
-- Miroslav Suchy, RHCA Red Hat, Senior Software Engineer, #brno, #devexp, #fedora-buildsys