Team,
I've been working on getting jigdo usable for Fedora Unity. In the process, I needed a way to have a single URL to access RPMs on public mirrors. Assume the requested file is foo.rpm. What I have done so far is set up a rewrite map that does the following:
* Pull the mirrorlist while preserving the requesting IP as X-Forwarded-For (for geoIP)
* Parse the mirrorlist
* Loop over the mirrorlist in random order and send a HEAD request for foo.rpm to each mirror; if the response is not a 404, move on to the redirect
* Redirect the request for foo.rpm (using a 302) to a public mirror that has been verified to have the file
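For illustration only, the steps above could be sketched in Python roughly as follows. This is not the actual rewrite-map code; the function names (`fetch_mirrorlist`, `parse_mirrorlist`, `head_ok`, `pick_mirror`) are hypothetical, and the mirrorlist is assumed to be plain text with one base URL per line and `#` comments.

```python
import random
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def fetch_mirrorlist(mirrorlist_url, client_ip, timeout=10):
    """Fetch the mirrorlist, passing the client's IP along for geoIP."""
    req = Request(mirrorlist_url, headers={"X-Forwarded-For": client_ip})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", "replace")


def parse_mirrorlist(text):
    """Return mirror base URLs, skipping blank lines and '#' comments."""
    mirrors = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            mirrors.append(line.rstrip("/") + "/")
    return mirrors


def head_ok(url, timeout=5):
    """Send a HEAD request; True only if the mirror answers with a 2xx."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (HTTPError, URLError, OSError):
        return False  # 404s, timeouts and dead mirrors all count as "no"


def pick_mirror(mirrors, filename, probe=head_ok, rng=random):
    """Try mirrors in random order; return the first URL that has the file."""
    candidates = list(mirrors)
    rng.shuffle(candidates)
    for base in candidates:
        url = urljoin(base, filename.lstrip("/"))
        if probe(url):
            return url  # the front end would answer "302 Location: <url>"
    return None  # no mirror has the file
```

The `probe` argument is there so the selection logic can be exercised without touching the network; a real deployment would leave it at the default.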
Things I plan on doing:
* Cache results (assuming no DB backend, do RAM caching based on session: IP, remote mirror, etc.)
* Maybe rate public mirrors based on:
  o Number of missing files (404)
  o Latency (granted, this is measured from the rewrite server)
  o Bitrate test (run every hour or so, also from the rewrite server)
* Set up a database that can be prepopulated with this data, potentially using data from mirrormanager
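As a rough sketch of the first two items, assuming an in-process cache with a fixed expiry and a very simple rating formula: the class names, the five-minute default and the 404 weight of 10 are all made up for illustration and are not measured or tuned values.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl=300, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self.clock() >= expires:
            del self._data[key]  # stale: forget it and report a miss
            return None
        return value

    def put(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)


class MirrorStats:
    """Running 404 and latency counters for one mirror."""

    def __init__(self):
        self.requests = 0
        self.missing = 0
        self.latency_total = 0.0

    def record(self, found, latency):
        self.requests += 1
        self.latency_total += latency
        if not found:
            self.missing += 1

    def score(self):
        """Lower is better: mean latency, inflated by the share of 404s."""
        if self.requests == 0:
            return 0.0
        miss_ratio = self.missing / self.requests
        mean_latency = self.latency_total / self.requests
        return mean_latency * (1.0 + 10.0 * miss_ratio)  # 10.0 is arbitrary
```

A cache key such as `(client_ip, filename)` would map to the mirror URL chosen last time, so repeat requests skip the HEAD probes until the entry expires.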
If needed, round robin would help keep things going, or we could even just use pound (or something else) between two machines.
Any thoughts? I'd like to have the Fedora Project provide this feature. If not, I will be setting it up anyway.
Jonathan Steffan daMaestro
On Thu, Feb 22, 2007 at 02:23:57PM -0700, Jonathan Steffan wrote:
> I've been working on getting jigdo usable for Fedora Unity. In the
> process, I needed a way to have a single url to access rpms on public
> mirrors. Assume the requested file is foo.rpm. What I have done so far
> is set up a rewrite map that does the following:
This will be great, especially if we start creating jigdo templates as part of the Fedora release. I'm going to look at Pungi soon to see how feasible it is to get the jigdo-template command added right after the ISOs are created.
Jonathan Steffan (jonathansteffan@gmail.com) said:
> I've been working on getting jigdo usable for Fedora Unity. In the
> process, I needed a way to have a single url to access rpms on public mirrors.
Why? Does jigdo only understand "I have file named x-y-z", as opposed to any particular URL to find it?
Bill
Bill Nottingham wrote:
> Jonathan Steffan (jonathansteffan@gmail.com) said:
>> I've been working on getting jigdo usable for Fedora Unity. In the
>> process, I needed a way to have a single url to access rpms on public mirrors.
> Why? Does jigdo only understand "I have file named x-y-z", as opposed to any particular URL to find it?
> Bill
Take a look at a jigdo template. This is one from a 20070219 spin using revisor (which uses pungi).
http://files.damaestro.us/FC-6-i386-DVD.iso.jigdo
Because mirrors have different directory structures, it makes more sense not to list every mirror for each part, and instead to be geoIP-aware and have a single point of entry. It also makes the most sense to direct users only to mirrors that have the parts. That jigdo file is usable, if anyone wants to give it a try. That is not to say it will always work; it is just a test. So those of you who just want a Re-Spin, please don't use it.
Jonathan
On Thu, Feb 22, 2007 at 02:23:57PM -0700, Jonathan Steffan wrote:
> Team,
> I've been working on getting jigdo usable for Fedora Unity. In the
> process, I needed a way to have a single url to access rpms on public mirrors.
Mirrormanager has code to do this right now if you want it. (Granted, it's not in use yet, but soon...). The URLs underneath /pub/.... on the mirrormanager URL provide the list of up-to-date mirrors that have that content. Right now it's all directory-based, but could easily be extended to be file-based. By that I mean:
http://admin.fedora.redhat.com/mirrormanager/pub/fedora/linux/core/6/i386/is...
will return the list of mirrors containing the content of pub/fedora/linux/core/6/i386/iso/ that's up-to-date.
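To make the directory-based versus file-based distinction concrete, here is a small sketch. It assumes the directory lookup has already been turned into a dictionary that maps a directory under pub/ to the URLs of that same directory on each up-to-date mirror; that data shape, the example hosts and the name `mirrors_for_file` are assumptions for illustration, not mirrormanager's actual API.

```python
import posixpath


def mirrors_for_file(dir_index, file_path):
    """Derive per-file URLs from a per-directory mirror index.

    dir_index maps a directory such as 'pub/fedora/linux/core/6/i386/iso'
    to a list of URLs for that directory on each mirror (assumed shape).
    """
    file_path = file_path.lstrip("/")
    directory = posixpath.dirname(file_path)
    name = posixpath.basename(file_path)
    # A file is assumed to be present wherever its directory is up to date.
    return [url.rstrip("/") + "/" + name
            for url in dir_index.get(directory, [])]
```

Because each mirror keeps its own URL for the directory, differing mirror layouts do not matter to the caller; only the file name is appended.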
> Assume the requested file is foo.rpm. What I have done so far is set up a rewrite map that does the following:
> * Pull the mirrorlist while preserving the requesting IP as X-Forwarded-For (for geoIP)
> * Parse the mirrorlist
> * Random loop the mirrorlist and request the HEAD for foo.rpm from a given mirror, if not 404 continue
Is the code for this piece available? Does it use keepalives? :-) Keepalives are what I tried to hack into the mirrormanager crawler earlier this week. (urlgrabber's keepalive.py is close to what I need, but I need it to do HEADs, not GETs, so I was overriding various parts and it got messy in a hurry.)
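For reference, several HEAD requests can share one kept-alive connection using only Python's standard library. The sketch below uses Python 3's `http.client` rather than urlgrabber's keepalive.py; `head_many` and its `conn_factory` parameter are hypothetical names, and a failed request is simply recorded as `None`.

```python
import http.client
from urllib.parse import urlsplit


def head_many(base_url, names, timeout=10, conn_factory=None):
    """HEAD each file under base_url over one connection; {name: status}."""
    parts = urlsplit(base_url)
    if conn_factory is None:
        conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                    else http.client.HTTPConnection)

        def conn_factory():
            return conn_cls(parts.netloc, timeout=timeout)

    conn = conn_factory()
    base_path = parts.path.rstrip("/")
    results = {}
    try:
        for name in names:
            try:
                conn.request("HEAD", base_path + "/" + name.lstrip("/"))
                resp = conn.getresponse()
                resp.read()  # finish the response so the socket is reusable
                results[name] = resp.status
            except (http.client.HTTPException, OSError):
                conn.close()  # the next request() reconnects automatically
                results[name] = None
    finally:
        conn.close()
    return results
```

Something like `head_many("http://mirror.example/pub/fedora/", ["foo.rpm", "bar.rpm"])` would return a status per file, with the mirror host being a placeholder here.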
> * Redirect the request for foo.rpm to a public mirror (using 302) that has verified it has the file
> Things I plan on doing:
> * Cache results (assuming no DB backend, do RAM caching based on session: IP, remote mirror, etc.)
> * Maybe rate public mirrors based on:
>   * Number of missing files (404)
>   * Latency (granted, this is measured from the rewrite server)
>   * Bitrate test (run every hour or so, also from the rewrite server)
> * Set up a database that can be prepopulated with this data, potentially using data from mirrormanager
:-)
> If needed, round robin would help keep things going.. or even just use pound (or something else) between two machines.
> Any thoughts? I'd like to have Fedora Project to provide this feature. If not, I will be setting it up anyways.
> Jonathan Steffan daMaestro
If mirrormanager can provide what you need, I'm certainly open to contributions (ideas and/or patches).
Thanks, Matt
infrastructure@lists.fedoraproject.org