On Wed, 21 Mar 2012 11:03:24 -0400 seth vidal skvidal@fedoraproject.org wrote:
On Wed, 21 Mar 2012 08:33:51 -0600 Kevin Fenzi kevin@scrye.com wrote:
There are a few things we could do about fas load:
a) add more fas servers.
b) reduce the number of runs. How often do we change someone in sysadmin-noc, sysadmin-main, sysadmin-build?
c) move to a system where we only re-run fasClient when there is a change.
I'm thinking for the hosts which are sysadmin-ish only, do (c).
For the publicish hosts, continue to poll fas directly.
so:
- hosted, people, bastion, publictests == poll
- everything else is a set built and pushed to them.
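A rough sketch of that split, just to make it concrete. The four poll names come from the list above; "builder" and "db" in the usage note and the function names are made up for illustration and are not anything that exists in our setup today.

```python
# Hosts that keep polling fas directly (names taken from the list above).
POLL_ROLES = {"hosted", "people", "bastion", "publictests"}


def sync_mode(role):
    """Return "poll" for the publicish hosts, "push" for everything else."""
    return "poll" if role in POLL_ROLES else "push"


def partition(roles):
    """Group a list of host roles by how they would get their account data."""
    plan = {"poll": [], "push": []}
    for role in roles:
        plan[sync_mode(role)].append(role)
    return plan
```

For example, partition(["bastion", "builder", "people", "db"]) would put bastion and people under "poll" and the two hypothetical roles under "push".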
Yeah, the trick is knowing when there is a change that affects them...
I wonder if we could make fas smarter. Have a serial # for each group. It pulls and keeps track of that. Then it pulls again but just asks "what serial # do you have for groups x, y, z". Probably too much added complexity, I guess.
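To show how small the client side of the serial # idea could be, here is a minimal sketch. It assumes fas could hand back a group-to-serial mapping, which it does not do today as far as I know; the function name and the sample numbers are invented.

```python
def groups_needing_refresh(local_serials, remote_serials, watched_groups):
    """Return the watched groups whose serial on the server differs from ours.

    local_serials / remote_serials: dicts mapping group name -> serial number.
    A group we have never seen locally is always treated as stale.
    """
    stale = []
    for group in watched_groups:
        if group not in local_serials:
            stale.append(group)
        elif local_serials[group] != remote_serials.get(group):
            stale.append(group)
    return stale


# Hypothetical serials: only sysadmin-main changed since the last pull.
local = {"sysadmin-noc": 12, "sysadmin-main": 40, "sysadmin-build": 7}
remote = {"sysadmin-noc": 12, "sysadmin-main": 41, "sysadmin-build": 7}
stale = groups_needing_refresh(local, remote, sorted(local))
```

A host would only need a full fasClient run when the returned list is non-empty, so most checks would be a single cheap query.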
I'd agree, collectd off probably. Or at least a separate one if we needed to monitor them.
I'm not sure what benefit we get from collectd on transient builders, though.
On our long-running hosts I understand but not on the builders.
Yeah, the only case I can see is so we could see how loaded they are... and we might have better ways to tell that.
Yeah, we could hopefully have another network that's larger than /24 for the arm builders.
I can imagine various network changes should easily allow us to allocate larger than a /24 to the internal build network.
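For a sense of scale, a quick bit of arithmetic with Python's ipaddress module. The 10.0.0.0 range and the /22 are placeholders picked for the example, not our real build network or a proposed size.

```python
import ipaddress


def usable_hosts(cidr):
    """Addresses left for hosts after removing the network and broadcast addresses."""
    return ipaddress.ip_network(cidr).num_addresses - 2


def widen(cidr, new_prefix):
    """Return the enclosing network with a shorter prefix, as a string."""
    return str(ipaddress.ip_network(cidr).supernet(new_prefix=new_prefix))
```

With these, usable_hosts("10.0.0.0/24") is 254, while widen("10.0.0.0/24", 22) gives "10.0.0.0/22", which has room for 1022 hosts.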
Yeah.
I'm sure some of this will be a process of 'oh no, what we have now doesn't scale, let's fix it'. Of course some of it we can get ready for up front too.
yay for planning! :)
Overall I like the idea of the automated builder re-install and think it will get us more ready for things like a large arm cluster.
Then I will get crackin' on making it work.
Sounds good.
kevin