Just a quick thought after tonight's outage.
If *everything* is down, the most important thing to have up is the mirrormanager since all of our users depend on it on a regular basis.
It might be relatively easy to periodically replicate the entire mirrormanager database over to an external box (e.g. at serverbeach) so we can repoint DNS and bring it back online at a moment's notice.
Just a thought.
Warren Togami wtogami@redhat.com
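A minimal sketch of the periodic replication idea above, assuming the database is PostgreSQL and that a dump-and-copy cycle run from cron is good enough. The thread does not say which database engine mirrormanager uses, and the database name, dump path, and remote target are hypothetical placeholders, so treat this as an illustration rather than the actual setup.

```python
import subprocess


def build_replication_commands(db_name, dump_path, remote_target):
    """Return the two commands for one dump-and-copy cycle.

    Assumption: PostgreSQL (pg_dump) plus rsync over ssh. Neither is
    confirmed by the thread; swap in the real tools as needed.
    """
    # Dump the whole database into a single custom-format archive file.
    dump_cmd = ["pg_dump", "--format=custom", f"--file={dump_path}", db_name]
    # Copy the archive to the external box (archive mode, compressed).
    copy_cmd = ["rsync", "-az", dump_path, remote_target]
    return [dump_cmd, copy_cmd]


def replicate(db_name, dump_path, remote_target, runner=subprocess.run):
    """Run the dump, then the copy, stopping at the first failure."""
    for cmd in build_replication_commands(db_name, dump_path, remote_target):
        # check=True makes subprocess.run raise if the command fails,
        # so a broken dump is never copied over a good one.
        runner(cmd, check=True)
```

Called from a cron job as, say, replicate("mirrormanager", "/var/tmp/mm.dump", "backup.example.org:/srv/mm/") (all three values hypothetical), this would leave a restorable copy on the external box. Restoring it and repointing DNS would still be separate manual steps.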
Warren Togami wrote:
Just a quick thought after tonight's outage.
If *everything* is down, the most important thing to have up is the mirrormanager since all of our users depend on it on a regular basis.
It might be relatively easy to periodically replicate the entire mirrormanager database over to an external box (e.g. at serverbeach) so we can repoint DNS and bring it back online at a moment's notice.
+1
Warren Togami wrote:
Just a quick thought after tonight's outage.
If *everything* is down, the most important thing to have up is the mirrormanager since all of our users depend on it on a regular basis.
Mirrormanager is perfectly designed in this respect. Long story short, everything that has anything to do with distribution needs to not be in a Red Hat colo. Certainly not as the SPOF. This will require a few external pieces (including a backup vpn server, which openVPN makes pretty easy to do).
-Mike
On Thu, 2007-11-08 at 21:52 -0600, Mike McGrath wrote:
Warren Togami wrote:
Just a quick thought after tonight's outage.
If *everything* is down, the most important thing to have up is the mirrormanager since all of our users depend on it on a regular basis.
Mirrormanager is perfectly designed in this respect. Long story short, everything that has anything to do with distribution needs to not be in a Red Hat colo. Certainly not as the SPOF. This will require a few external pieces (including a backup vpn server, which openVPN makes pretty easy to do).
The infrastructure we need to support:
build: koji, plague, local mirrors for building, cvs, bodhi, pkgdb
distribution: mirrormanager, mirrormaster, staging, torrent
primary site: fp.o, wiki, docs, start.fp.o
meta services all of the above will or do require: vpn, fas[2], puppet, infrastructure-cvs, dns, mail
Does that sound right?
-sv
On Thu, 2007-11-08 at 22:57 -0500, seth vidal wrote:
On Thu, 2007-11-08 at 21:52 -0600, Mike McGrath wrote:
Warren Togami wrote:
Just a quick thought after tonight's outage.
If *everything* is down, the most important thing to have up is the mirrormanager since all of our users depend on it on a regular basis.
Mirrormanager is perfectly designed in this respect. Long story short, everything that has anything to do with distribution needs to not be in a Red Hat colo. Certainly not as the SPOF. This will require a few external pieces (including a backup vpn server, which openVPN makes pretty easy to do).
The infrastructure we need to support:
build: koji, plague, local mirrors for building, cvs, bodhi, pkgdb
distribution: mirrormanager, mirrormaster, staging, torrent
primary site: fp.o, wiki, docs, start.fp.o
meta services all of the above will or do require: vpn, fas[2], puppet, infrastructure-cvs, dns, mail
meta services additions: backups
and then mostly unrelated services: fedorapeople.org, planet.fedoraproject.org, hosted.fedoraproject.org
-sv
Part of the primary site will be hosted in Germany (right now Telia/ProIO are just waiting for our server). We still have half a rack to use, and in my personal experience working with Telia and ProIO I don't remember a single "unscheduled" outage in that colo in the last year.
Paulo
Fedora-infrastructure-list mailing list Fedora-infrastructure-list@redhat.com https://www.redhat.com/mailman/listinfo/fedora-infrastructure-list
On Fri, Nov 09, 2007 at 09:16:05AM +0100, Paulo Santos wrote:
Part of the primary site will be hosted in Germany (right now Telia/ProIO are just waiting for our server). We still have half a rack to use, and in my personal experience working with Telia and ProIO I don't remember a single "unscheduled" outage in that colo in the last year.
Seth said he'd get additional app servers put together (starting today) at serverbeach. Once we've got the additional equipment in Germany, adding high-priority app servers there too would be very good.
seth vidal (skvidal@fedoraproject.org) said:
build: koji, plague, local mirrors for building, cvs, bodhi, pkgdb
distribution: mirrormanager, mirrormaster, staging, torrent
primary site: fp.o, wiki, docs, start.fp.o
meta services all of the above will or do require: vpn, fas[2], puppet, infrastructure-cvs, dns, mail
meta services additions: backups
and then mostly unrelated services: fedorapeople.org, planet.fedoraproject.org, hosted.fedoraproject.org
If I was planning strategy, the priority for offsite/redundancy would be:
1) mirrormanager
2) torrent
3) start.fp.o, docs, fp.o
4) mirrormaster
5) hosted
6) wiki
7) fedorapeople/planet
8) build stuff
(meta-services sprinkled in as necessary)
Bill
On Fri, 2007-11-09 at 10:09 -0500, Bill Nottingham wrote:
seth vidal (skvidal@fedoraproject.org) said:
build: koji, plague, local mirrors for building, cvs, bodhi, pkgdb
distribution: mirrormanager, mirrormaster, staging, torrent
primary site: fp.o, wiki, docs, start.fp.o
meta services all of the above will or do require: vpn, fas[2], puppet, infrastructure-cvs, dns, mail
meta services additions: backups
and then mostly unrelated services: fedorapeople.org, planet.fedoraproject.org, hosted.fedoraproject.org
If I was planning strategy, the priority for offsite/redundancy would be:
- mirrormanager
- torrent
- start.fp.o, docs, fp.o
- mirrormaster
- hosted
- wiki
- fedorapeople/planet
- build stuff
(meta-services sprinkled in as necessary)
The importance order is something separate. I was just listing the services so we can create interdependency charts.
For example: mirrormanager is useless w/o fas :)
-sv
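A rough sketch of what such an interdependency chart could look like as data. Only the mirrormanager -> fas edge comes from the message above; the other edges are illustrative assumptions, and the helper names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Maps each service to the set of services it depends on.
DEPENDS_ON = {
    "mirrormanager": {"fas"},   # from the thread
    "fas": {"dns"},             # assumption, for illustration only
    "wiki": {"fas", "dns"},     # assumption, for illustration only
    "dns": set(),
}


def bring_up_order(graph):
    """Return services ordered so every dependency comes before its users."""
    return list(TopologicalSorter(graph).static_order())


def affected_by(graph, failed):
    """Return every service that directly or indirectly depends on `failed`."""
    affected = set()
    changed = True
    while changed:
        changed = False
        for service, deps in graph.items():
            if service in affected:
                continue
            # A service is hit if it needs the failed one, or needs
            # something that has already been marked as affected.
            if failed in deps or deps & affected:
                affected.add(service)
                changed = True
    return affected
```

With this graph, affected_by(DEPENDS_ON, "fas") returns the services that would be impaired if fas went away, which is the kind of question the chart is meant to answer before deciding what to replicate offsite.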
On Fri, 09 Nov 2007 11:46:06 -0500 seth vidal skvidal@fedoraproject.org wrote:
For example: mirrormanager is useless w/o fas :)
There are degrees of usefulness. Surely you don't need fas just to respond to mirrorlist cgi requests, or provide a listing of last known good mirrors for direct downloading...
On Fri, 2007-11-09 at 11:46 -0500, seth vidal wrote:
The importance order is something separate. I was just listing the services so we can create interdependency charts.
For example: mirrormanager is useless w/o fas :)
The management side of mirrormanager is. But the "serve out the existing mirror information" side is perfectly fine without FAS (and realistically, is the part that more of us think of when we think MM).
Jeremy
On Fri, 2007-11-09 at 12:16 -0500, Jeremy Katz wrote:
On Fri, 2007-11-09 at 11:46 -0500, seth vidal wrote:
The importance order is something separate. I was just listing the services so we can create interdependency charts.
For example: mirrormanager is useless w/o fas :)
The management side of mirrormanager is. But the "serve out the existing mirror information" side is perfectly fine without FAS (and realistically, is the part that more of us think of when we think MM).
in my mind:
mirrormanager - the management side
mirrorlist - the actual mirrorlists
-sv
On Fri, 9 Nov 2007, seth vidal wrote:
On Fri, 2007-11-09 at 12:16 -0500, Jeremy Katz wrote:
The management side of mirrormanager is. But the "serve out the existing mirror information" side is perfectly fine without FAS (and realistically, is the part that more of us think of when we think MM).
in my mind:
mirrormanager - the management side
mirrorlist - the actual mirrorlists
Ditto. I do agree, though, that we could probably make mirrorlist redundant without necessarily making mirrormanager so, and that would fulfill most of our needs.
Jima
On Fri, Nov 09, 2007 at 01:09:30PM -0600, Jima wrote:
On Fri, 9 Nov 2007, seth vidal wrote:
On Fri, 2007-11-09 at 12:16 -0500, Jeremy Katz wrote:
The management side of mirrormanager is. But the "serve out the existing mirror information" side is perfectly fine without FAS (and realistically, is the part that more of us think of when we think MM).
in my mind:
mirrormanager - the management side
mirrorlist - the actual mirrorlists
Ditto. I do agree, though, that we could probably make mirrorlist redundant without necessarily making mirrormanager so, and that would fulfill most of our needs.
Right. In practice, mirrorlist doesn't need regular access to the database, and if we can't reach the database, it can continue to offer slightly stale data (e.g. >1 hour old, which right now it never is). Which should be just fine.
Mike wanted me to stop doing queries on app4, and just pull the data that's queried from app3. Now that we're talking about more app servers for the mirrorlist, this makes even more sense, and I'll look to do this. We will need app3 to generate the data, and then copy it to the other app servers.
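The "keep serving slightly stale data" behaviour described above can be sketched roughly as follows. This is not mirrormanager's actual code: the class name, the `fetch` callable, and the one-hour refresh interval are assumptions made for illustration.

```python
import time


class MirrorlistCache:
    """Serve the last good mirror data when a refresh fails."""

    def __init__(self, fetch, max_age_seconds=3600, clock=time.time):
        self._fetch = fetch              # callable that loads fresh data
        self._max_age = max_age_seconds  # hypothetical refresh interval
        self._clock = clock              # injectable for testing
        self._data = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        fresh = (
            self._fetched_at is not None
            and now - self._fetched_at < self._max_age
        )
        if not fresh:
            try:
                self._data = self._fetch()
                self._fetched_at = now
            except Exception:
                # Source unreachable: fall back to the stale copy.
                # With nothing cached yet there is no fallback.
                if self._fetched_at is None:
                    raise
        return self._data

    def age(self):
        """Seconds since the last successful refresh, or None if never."""
        if self._fetched_at is None:
            return None
        return self._clock() - self._fetched_at
```

Here `fetch` would stand in for whatever pulls the generated data from app3. An app server holding such a cache keeps answering mirrorlist requests from its last copy while the database or app3 is unreachable, and `age()` gives monitoring a way to notice when the data has gone stale.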
On Thu, 2007-11-08 at 22:57 -0500, seth vidal wrote:
On Thu, 2007-11-08 at 21:52 -0600, Mike McGrath wrote:
Warren Togami wrote:
Just a quick thought after tonight's outage.
If *everything* is down, the most important thing to have up is the mirrormanager since all of our users depend on it on a regular basis.
Mirrormanager is perfectly designed in this respect. Long story short, everything that has anything to do with distribution needs to not be in a Red Hat colo. Certainly not as the SPOF. This will require a few external pieces (including a backup vpn server, which openVPN makes pretty easy to do).
The infrastructure we need to support:
[snip]
Does that sound right?
That seems like a reasonable list. The other thing to keep in mind, though, is that some things can be lower on the list for trying to duplicate. E.g., if the colo is down, the fact that a lot of the developer infrastructure is down is fairly acceptable, I think.
Now if the colo is down because a hole opened in the earth and the DC has fallen to the center of the earth, then it's another question. But then we're talking substantial disaster recovery and not just avoiding SPOF for our users during colo downtimes.
Jeremy
infrastructure@lists.fedoraproject.org