It dawns on me I never actually sent this to the list for comments and we haven't officially adopted it yet.
http://infrastructure.fedoraproject.org/csi/host-lifecycle-policy/en-US/html...
It's still in flux but give it a read over. For some stuff it seems like it adds more work (like post kickstart checklist) but really it just lists what needs to be done as part of bringing a host online. Backups and monitoring are things that often get forgotten about.
I'm working on a very similar doc for bringing services online.
-Mike
On Mon, Jan 18, 2010 at 03:15:20PM -0600, Mike McGrath wrote:
> It dawns on me I never actually sent this to the list for comments and we haven't officially adopted it yet.
> http://infrastructure.fedoraproject.org/csi/host-lifecycle-policy/en-US/html...
> It's still in flux but give it a read over. For some stuff it seems like it adds more work (like post kickstart checklist) but really it just lists what needs to be done as part of bringing a host online. Backups and monitoring are things that often get forgotten about.
It looks good after a quick glance. I didn't know about needs-restarting.py, awesome. I'm going to print out all of the CSI this week and start reviewing.
It also sounds like it would be nice to have all of the Package Integrity & Security Update Checks run at regular intervals by Zabbix (e.g., letting us know when security updates are available, when services need to be restarted, etc.).
luke
On Mon, 18 Jan 2010, Luke Macken wrote:
> On Mon, Jan 18, 2010 at 03:15:20PM -0600, Mike McGrath wrote:
> > It dawns on me I never actually sent this to the list for comments and we haven't officially adopted it yet.
> > http://infrastructure.fedoraproject.org/csi/host-lifecycle-policy/en-US/html...
> > It's still in flux but give it a read over. For some stuff it seems like it adds more work (like post kickstart checklist) but really it just lists what needs to be done as part of bringing a host online. Backups and monitoring are things that often get forgotten about.
> It looks good after quick glance. I didn't know about needs-restarting.py, awesome. I'm going to print out all of the CSI this week and start reviewing.
> It also sounds like it could be nice to have all of the Package Integrity & Security Update Checks running on regular intervals by Zabbix? (eg: letting us know when security updates are available, when services need to be restarted, etc).
Yeah, we don't have all of that stuff implemented yet, but some of it is. I've been going back and forth between having Zabbix do it or having cron do it. The main reason I've been leaning towards cron is that the checks produce no output on success and can take a long time to run (I worry about Zabbix timeouts).
-Mike
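For anyone weighing the cron option, here is a minimal sketch of a check that stays quiet on success, so cron only sends mail when something needs attention. It is not taken from the policy document. The commands are assumptions: it presumes `yum --security check-update` is available (yum with the security plugin, exit status 100 when updates are pending) and that `needs-restarting` from yum-utils is on the PATH. The function names and the 30-minute timeout are made up for illustration.

```python
"""Cron-friendly host check (sketch): report only when action is needed."""
import subprocess

# Assumed commands; adjust for the host being checked.
SECURITY_CHECK = ["yum", "-q", "--security", "check-update"]
RESTART_CHECK = ["needs-restarting"]


def run(cmd, timeout=1800):
    """Run cmd and return (exit_code, combined_output) without raising."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Missing binary or a run that exceeded the timeout.
        return -1, str(exc)
    return proc.returncode, proc.stdout


def collect_problems(runner=run):
    """Return a list of human-readable findings; empty means all clear."""
    problems = []

    code, out = runner(SECURITY_CHECK)
    if code == 100:
        # yum check-update uses exit status 100 for "updates available".
        problems.append("Security updates available:\n" + out.strip())
    elif code != 0:
        problems.append("Security check failed (exit %d):\n%s" % (code, out.strip()))

    code, out = runner(RESTART_CHECK)
    if code != 0:
        problems.append("Restart check failed (exit %d):\n%s" % (code, out.strip()))
    elif out.strip():
        # Any listed process is still running code from before an update.
        problems.append("Processes needing restart:\n" + out.strip())

    return problems


def main():
    """Print findings, if any, and return a shell-style exit status."""
    problems = collect_problems()
    if problems:
        print("\n\n".join(problems))
        return 1
    return 0
```

Calling `main()` from the script's entry point and running it from cron would mean a clean host generates no mail at all. Because the check runs outside the monitoring system, a slow yum run does not bump into a Zabbix item timeout.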
On 01/18/2010 09:15 PM, Mike McGrath wrote:
> It dawns on me I never actually sent this to the list for comments and we haven't officially adopted it yet.
> http://infrastructure.fedoraproject.org/csi/host-lifecycle-policy/en-US/html...
> It's still in flux but give it a read over. For some stuff it seems like it adds more work (like post kickstart checklist) but really it just lists what needs to be done as part of bringing a host online. Backups and monitoring are things that often get forgotten about.
> I'm working on a very similar doc for bringing services online.
> -Mike
Great work Mike.
There's one thing I would like to share and recommend that is not documented there: use different colored Ethernet cables for different purposes.
For instance, we use:
  Red for server management ports
  White for the public network interface
  Blue for the backup network interface
  etc.
This is most helpful when you need to unplug a machine, service it, and plug it back in again, and it is much less work than labeling each cable, which you will need to do if they are all the same color.
JBG