https://fedorahosted.org/InstantMirror/
I have written down design notes for the new InstantMirror project. This project is meant to be fostered in Google Summer of Code 2009, with multiple Red Hat engineers as mentors, and possibly multiple students working together as a team. Students are to be judged by their ability to work with others as a team, their ability to implement the plan, and their ability to encourage participation from further volunteers interested in solving these problems.
Major Implications of InstantMirror
1. Instant changes on mirrors. From the perspective of users, changed files on the master can appear for download "instantly" on mirrors.
2. Read-only network filesystem that can replicate data in a torrent-like swarming manner, with snapshots and tags to preserve access to previous contents of the filesystem. This has the potential to be far more robust and efficient than rsync mirrors, while being more flexible than reverse caching proxy mirrors.
3. Local versioning backup filesystem. Simply turn off the networking and client-side cache and you have a local versioning filesystem with tags, useful for backups. It preserves old versions of your files space-efficiently, while making it easy to access those files.
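As a rough illustration of the "snapshots and tags" idea in points 2 and 3, here is a minimal Python sketch of a store that keeps each distinct file content once and lets a tag name an older state of the tree. It is not taken from the InstantMirror design notes: the class name `SnapshotStore`, its methods, and the choice of SHA-256 content addressing are all assumptions made for illustration.

```python
import hashlib


class SnapshotStore:
    """Toy versioned store (hypothetical): blobs are kept once, keyed by digest."""

    def __init__(self):
        self.blobs = {}      # digest -> file content (bytes)
        self.snapshots = []  # each snapshot maps path -> digest
        self.tags = {}       # tag name -> snapshot index

    def commit(self, files, tag=None):
        """Record a new snapshot of `files` (path -> bytes) and return its index."""
        manifest = {}
        for path, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            # Content that did not change between snapshots is not stored again.
            self.blobs.setdefault(digest, data)
            manifest[path] = digest
        self.snapshots.append(manifest)
        index = len(self.snapshots) - 1
        if tag is not None:
            self.tags[tag] = index
        return index

    def read(self, path, tag=None):
        """Return the content of `path` in the tagged snapshot (default: latest)."""
        index = self.tags[tag] if tag is not None else len(self.snapshots) - 1
        return self.blobs[self.snapshots[index][path]]


# Made-up paths and contents: the metadata changes, the package does not.
store = SnapshotStore()
store.commit({"repodata/repomd.xml": b"v1", "foo-1.0.rpm": b"payload"}, tag="day1")
store.commit({"repodata/repomd.xml": b"v2", "foo-1.0.rpm": b"payload"}, tag="day2")
```

Reading `repodata/repomd.xml` with `tag="day1"` still returns the old content, while the unchanged package is stored only once. That is the space-efficiency property point 3 describes, under the assumptions above.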
http://socghop.appspot.com/document/show/program/google/gsoc2009/faqs#timeli...
I will keep adding to and fixing up the text on this page over the next day. Assuming Fedora is accepted as a mentoring organization, it seems we have until April 3rd to accept applications for Google SoC. After that there is a review period in which we decide whom to accept as the official students.
We are not waiting for Google SoC to begin this project. Students who participate in the project, help shape the prototyping phase, and show community leadership prior to April 3rd will demonstrate that they are capable of being official SoC students.
How can you Help?
I believe this project is interesting and useful enough to our future that other volunteers would also be interested in getting involved. Currently im-devel-list is being created to focus development discussion. We could also use more knowledgeable mentors to aid the student(s) in this project.
Warren Togami wtogami@redhat.com
On Thu, Mar 12, 2009 at 8:58 PM, Warren Togami wtogami@redhat.com wrote:
[snip]
How can you Help?
I believe this project is plenty interesting and useful to our future that other volunteers would also be interested in getting involved. Currently im-devel-list is being created to focus development discussion. We also could use more knowledgeable mentors to aid the student(s) in this project.
This looks really interesting, I'd love to get involved (and as luck has it, would still be a student this summer too).
Since the mailing list is not actually up yet, would "im" not be too ambiguous a name? "imirror" or "inst-mirror" is probably more descriptive. Most of us would have auto-completing mail clients anyway (does mutt do it?)
Regards,
On Fri, Mar 13, 2009 at 6:28 AM, Warren Togami wtogami@redhat.com wrote:
https://fedorahosted.org/InstantMirror/
I have wrote down design notes of the new InstantMirror project. This project is meant to be fostered in Google Summer of Code 2009, with multiple Red Hat engineers as mentors, and possibly multiple students working together as a team. Students are to be judged by their ability to work with others as a team, their ability to implement the plan, and their ability to encourage participation from further volunteers interested in solving these problems.
I could be wrong but wasn't there a similar GSoC project last year ?
Sankarshan Mukhopadhyay wrote:
On Fri, Mar 13, 2009 at 6:28 AM, Warren Togami wtogami@redhat.com wrote:
https://fedorahosted.org/InstantMirror/
I have wrote down design notes of the new InstantMirror project. This project is meant to be fostered in Google Summer of Code 2009, with multiple Red Hat engineers as mentors, and possibly multiple students working together as a team. Students are to be judged by their ability to work with others as a team, their ability to implement the plan, and their ability to encourage participation from further volunteers interested in solving these problems.
I could be wrong but wasn't there a similar GSoC project last year ?
Last year was IntelligentMirror, which had misguided goals and ultimately is not very useful.
This design if implemented properly could possibly upgrade all Fedora mirrors and make them operate a lot more efficiently. It would also make it real easy to deploy your own mini-mirror to serve your own local network. For example, at home I personally would use InstantMirror with maybe 30GB of cache allocated, but pre-fetching turned off.
Warren Togami Jr. warren@togami.com
On Fri, 13 Mar 2009, Warren Togami wrote:
Sankarshan Mukhopadhyay wrote:
On Fri, Mar 13, 2009 at 6:28 AM, Warren Togami wtogami@redhat.com wrote:
https://fedorahosted.org/InstantMirror/
I have wrote down design notes of the new InstantMirror project. This project is meant to be fostered in Google Summer of Code 2009, with multiple Red Hat engineers as mentors, and possibly multiple students working together as a team. Students are to be judged by their ability to work with others as a team, their ability to implement the plan, and their ability to encourage participation from further volunteers interested in solving these problems.
I could be wrong but wasn't there a similar GSoC project last year ?
Last year was IntelligentMirror, which had misguided goals and ultimately is not very useful.
I disagree with this assessment. IntelligentMirror has turned into quite an interesting project. It's gone well beyond its original scope as a caching mechanism for packages. There's no point in being rude about it.
-sv
On Fri, Mar 13, 2009 at 11:10 AM, Seth Vidal skvidal@fedoraproject.org wrote:
I disagree with this assessment. IntelligentMirror has turned into quite an interesting project. It's gone well beyond its original scope as a caching mechanism for packages. There's no point in being rude about it.
Which would then bring up the question - are the two projects (existing and proposed_new) diverging or, can the features/UseCase converge ?
On Fri, Mar 13, 2009 at 11:27 AM, Sankarshan Mukhopadhyay foss.mailinglists@gmail.com wrote:
On Fri, Mar 13, 2009 at 11:10 AM, Seth Vidal skvidal@fedoraproject.org wrote:
I disagree with this assessment. IntelligentMirror has turned into quite an interesting project. It's gone well beyond its original scope as a caching mechanism for packages. There's no point in being rude about it.
Which would then bring up the question - are the two projects (existing and proposed_new) diverging or, can the features/UseCase converge ?
And, I ask because, although I have not used IntelligentMirror, I read about it off Planet Fedora and, the development seems to be regular.
On Fri, Mar 13, 2009 at 12:40 AM, Seth Vidal skvidal@fedoraproject.org wrote:
On Fri, 13 Mar 2009, Warren Togami wrote:
Sankarshan Mukhopadhyay wrote:
On Fri, Mar 13, 2009 at 6:28 AM, Warren Togami wtogami@redhat.com wrote:
https://fedorahosted.org/InstantMirror/
I have wrote down design notes of the new InstantMirror project. This project is meant to be fostered in Google Summer of Code 2009, with multiple Red Hat engineers as mentors, and possibly multiple students working together as a team. Students are to be judged by their ability to work with others as a team, their ability to implement the plan, and their ability to encourage participation from further volunteers interested in solving these problems.
I could be wrong but wasn't there a similar GSoC project last year ?
Last year was IntelligentMirror, which had misguided goals and ultimately is not very useful.
I disagree with this assessment. IntelligentMirror has turned into quite an interesting project. It's gone well beyond its original scope as a caching mechanism for packages. There's no point in being rude about it.
Wasn't the original idea his as well? I thought he was simply hateful of his own idea. Happens to me all the time.
Arthur Pemberton wrote:
I disagree with this assessment. IntelligentMirror has turned into quite an interesting project. It's gone well beyond its original scope as a caching mechanism for packages. There's no point in being rude about it.
Wasn't the original idea his as well? I thought he was simply hateful of his own idea. Happens to me all the time.
The original InstantMirror was an attempt to make a reverse proxy server suitable for a yum repository. I quickly decided that approach was a dead-end architecturally then got busy on other things.
https://fedorahosted.org/fedora-infrastructure/browser/scripts/proxy-mirror I ended up making a really simple squid-based solution that works just fine with standard squid.conf options. Various folks have been using this solution for HTTP-only caching mirrors in production for a while now, mostly for local area network mirrors in branch offices or homes.
The IntelligentMirror guy took that original idea then went off in his own direction. He asked me to be involved last year, but I was uninterested because his web page was confusing, and what I could figure out I believed to be the wrong direction. I fail to see how intelligentmirror's squid redirector plugin is any improvement over standard squid.conf options. Something positive came out of this, though: it seems he learned skills and eventually implemented videocache, which is useful.
https://fedorahosted.org/InstantMirror/wiki/ExistingRepositoryReplicationMet... Folks have never been fully satisfied with squid HTTP-only reverse caching proxy. A full mirror wants real directories that can be served over multiple protocols or copied. I have thus been thinking about how to combine the benefits of the traditional rsync mirror with the robustness of an on-demand caching mirror.
https://fedorahosted.org/InstantMirror/ The new InstantMirror proposal solves these problems. Rik van Riel and I brainstormed these details back on November 26th, 2008, but I haven't had the time to write it all down until now. Several students have been bugging me in the last week to write this up for Google SoC 2009. Here it goes. If we can make this proposal work it will be an awesome step forward.
Warren Togami wtogami@redhat.com
Warren Togami wrote:
https://fedorahosted.org/fedora-infrastructure/browser/scripts/proxy-mirror I ended up making a really simple squid-based solution that works just fine with standard squid.conf options. Various folks have been using this solution for HTTP-only caching mirrors in production for a while now, mostly for local area network mirrors in branch offices or homes.
Oops, this was the wrong URL. This was pre-InstantMirror, prior to finding the squid.conf options that allow it to work as a cache without issues with yum.
https://fedorahosted.org/InstantMirror/wiki/ExistingRepositoryReplicationMet... I wrote those options here.
Warren
On Thu, Mar 12, 2009 at 11:23 PM, Warren Togami wtogami@redhat.com wrote:
The original InstantMirror was an attempt to make a reverse proxy server suitable for a yum repository. I quickly decided that approach was a dead-end architecturally then got busy on other things.
The original InstantMirror certainly has its limitations, but in its defense I would point out that it's survived a couple of years of constant use as the default Fedora repo at my company, with zero maintenance (I'm not even sure where the machine hosting it has gone...).
(See http://www.redhat.com/archives/fedora-devel-list/2007-November/msg01699.html for some background on this tool.)
I see the old bzr repository is no more; I'd be happy to host the current version of the code (all 120 lines of it) elsewhere if anyone is interested in using or extending it. Perhaps I should rename it to avoid confusion with the new-and-improved InstantMirror?
--Ed
Ed Swierk wrote:
On Thu, Mar 12, 2009 at 11:23 PM, Warren Togami wtogami@redhat.com wrote:
The original InstantMirror was an attempt to make a reverse proxy server suitable for a yum repository. I quickly decided that approach was a dead-end architecturally then got busy on other things.
The original InstantMirror certainly has its limitations, but in its defense I would point out that it's survived a couple of years of constant use as the default Fedora repo at my company, with zero maintenance (I'm not even sure where the machine hosting it has gone...).
Wow, you are using it despite the lack of cleanup? You don't run out of disk space?
Do you actually make use of the directory and filenames where it stores the files directly? If not, then you will find that a reverse squid proxy cache works great because it cleans up after itself.
(See http://www.redhat.com/archives/fedora-devel-list/2007-November/msg01699.html for some background on this tool.)
I see the old bzr repository is no more; I'd be happy to host the current version of the code (all 120 lines of it) elsewhere if anyone is interested in using or extending it. Perhaps I should rename it to avoid confusion with the new-and-improved InstantMirror?
If you want to continue development of it, that would be a good idea. Sorry I didn't think to ask if you objected to reusing the name. I thought the project was fully dead.
Warren Togami wtogami@redhat.com
On Wed, March 18, 2009 8:53 pm, Warren Togami wrote:
Ed Swierk wrote:
On Thu, Mar 12, 2009 at 11:23 PM, Warren Togami wtogami@redhat.com wrote: The original InstantMirror certainly has its limitations, but in its defense I would point out that it's survived a couple of years of constant use as the default Fedora repo at my company, with zero maintenance (I'm not even sure where the machine hosting it has gone...).
Wow, you are using it despite the lack of cleanup? You don't run out of disk space?
I use it too. I wrote a simple script that uses rsync to clean up, and I only need to run it a couple of times a year. Disk space is cheap, and you really don't need that much when you no longer mirror all of the game data you don't need.
Do you actually make use of the directory and filenames where it stores the files directly? If not, then you will find that a reverse squid proxy cache works great because it cleans up after itself.
I do use it quite a bit. Not strictly necessary, but convenient. I also like having the history and being able to go back to old updates packages for debugging.
I see the old bzr repository is no more; I'd be happy to host the current version of the code (all 120 lines of it) elsewhere if anyone is interested in using or extending it. Perhaps I should rename it to avoid confusion with the new-and-improved InstantMirror?
If you want to continue development of it, that would be a good idea. Sorry I didn't think to ask if you objected to reusing the name. I thought the project was fully dead.
Might be interesting to compare my current version to it as well.
Orion Poplawski wrote:
On Wed, March 18, 2009 8:53 pm, Warren Togami wrote:
Ed Swierk wrote:
On Thu, Mar 12, 2009 at 11:23 PM, Warren Togami wtogami@redhat.com wrote: The original InstantMirror certainly has its limitations, but in its defense I would point out that it's survived a couple of years of constant use as the default Fedora repo at my company, with zero maintenance (I'm not even sure where the machine hosting it has gone...).
Wow, you are using it despite the lack of cleanup? You don't run out of disk space?
I use it too. I wrote a simple script that uses rsync to clean up, and I only need to run it a couple of times a year. Disk space is cheap, and you really don't need that much when you no longer mirror all of the game data you don't need.
The lack of cleanup was only one problem. Another was it did not handle multiple users using it simultaneously.
Do you actually make use of the directory and filenames where it stores the files directly? If not, then you will find that a reverse squid proxy cache works great because it cleans up after itself.
I do use it quite a bit. Not strictly necessary, but convenient. I also like having the history and being able to go back to old updates packages for debugging.
http://kojipkgs.fedoraproject.org/packages/packagename You know you can grab old versions of packages from here?
Warren
Has anybody thought about zero-configuration mirror selection by broadcasting on the LAN, using something like UPnP, zeroconf, or DHCP, and getting a mirror list from a process small enough to run on an OpenWrt-type router?
On Wed, Mar 18, 2009 at 8:16 PM, Warren Togami wtogami@redhat.com wrote:
The lack of cleanup was only one problem. Another was it did not handle multiple users using it simultaneously.
It does handle multiple simultaneous users, although if the first one to access a file happens to be really slow, it will throttle other users downloading the same file, and if the first connection dies, it takes the other ones down with it. This isn't an issue once the complete file is stored in the mirror, just when fetching it from upstream.
In practice, though, I've never encountered a problem with multiple simultaneous users. I suspect yum's tenacious retry mechanism has masked any glitches caused by InstantMirror.
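To make the behaviour described above concrete, here is a minimal Python sketch of one way to share a single upstream fetch among simultaneous requests for the same uncached file. It is not the original InstantMirror code. `CoalescingFetcher` and the `fetch_upstream` callable are hypothetical names, and the sketch hands over whole files rather than streaming partial data. It does reproduce one property mentioned above: if the first fetch fails, the requests waiting on it fail too.

```python
import threading
from concurrent.futures import Future


class CoalescingFetcher:
    """Hypothetical helper: one upstream fetch per path, shared by all waiters."""

    def __init__(self, fetch_upstream):
        self._fetch_upstream = fetch_upstream  # assumed callable: path -> bytes
        self._lock = threading.Lock()
        self._in_flight = {}  # path -> Future shared by concurrent requests
        self._cache = {}      # path -> bytes, for files fetched completely

    def get(self, path):
        with self._lock:
            if path in self._cache:
                return self._cache[path]
            future = self._in_flight.get(path)
            leader = future is None
            if leader:
                # The first request for this path performs the upstream fetch.
                future = Future()
                self._in_flight[path] = future
        if not leader:
            # Later requests wait for the leader; its exception is re-raised here.
            return future.result()
        try:
            data = self._fetch_upstream(path)
        except Exception as exc:
            with self._lock:
                del self._in_flight[path]  # a later request may retry
            future.set_exception(exc)
            raise
        with self._lock:
            self._cache[path] = data
            del self._in_flight[path]
        future.set_result(data)
        return data


# A stand-in for the real upstream download, recording how often it is called.
calls = []


def fake_upstream(path):
    calls.append(path)
    return b"data for " + path.encode()


fetcher = CoalescingFetcher(fake_upstream)
first = fetcher.get("foo-1.0.rpm")
second = fetcher.get("foo-1.0.rpm")
```

Once the complete file is in the cache, later requests never touch upstream, which matches the remark that the issue only arises while a file is still being fetched.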
--Ed
Warren Togami wrote:
The lack of cleanup was only one problem. Another was it did not handle multiple users using it simultaneously.
Well, it's inefficient with multiple requests for the same non-cached file (multiple downloads).
http://kojipkgs.fedoraproject.org/packages/packagename You know you can grab old versions of packages from here?
Vaguely. But again, disk is cheap and they are already local.
I'm not saying that I don't want a better InstantMirror, just that this one is "usable".
On Wed, Mar 18, 2009 at 7:53 PM, Warren Togami wtogami@redhat.com wrote:
Wow, you are using it despite the lack of cleanup? You don't run out of disk space?
Nope. So far it's using about 350GB on a 750GB disk. We've been using FC6, F9 and F10 on i386 and x86_64, plus updates.
Do you actually make use of the directory and filenames where it stores the files directly? If not, then you will find that a reverse squid proxy cache works great because it cleans up after itself.
Not really, but I like being able to manage the data store manually if I wanted to. When the day finally comes that the 750GB is exhausted, I'll ssh to the mirrors box and type a couple of rm -rf's to free up some space and then forget about it for another year.
If you want to continue development of it, that would be a good idea. Sorry I didn't think to ask if you objected to reusing the name. I thought the project was fully dead.
I don't plan to hack on the code much myself, as it already works just fine for me. But I'm happy to host the code somewhere and help integrate patches if other people come up with improvements.
IIRC you came up with the name InstantMirror; you're welcome to reuse it.
--Ed
Seth Vidal wrote:
I disagree with this assessment. IntelligentMirror has turned into quite an interesting project. It's gone well beyond its original scope as a caching mechanism for packages. There's no point in being rude about it.
https://fedorahosted.org/intelligentmirror/ However looking at its home page, it is equally convoluted and self-contradictory as when I read it last year. The page fails to describe what is actually contained in the code, and instead contains lots of irrelevant or wrong details, or plans that didn't happen.
Looking at the code I can see the direction it took, which I was uninterested in being involved in last year. It looks good that the author learned skills and went on to implement something more useful in videocache.
Warren
2009/3/13 Warren Togami wtogami@redhat.com
Sankarshan Mukhopadhyay wrote:
On Fri, Mar 13, 2009 at 6:28 AM, Warren Togami wtogami@redhat.com wrote:
https://fedorahosted.org/InstantMirror/
I have wrote down design notes of the new InstantMirror project. This project is meant to be fostered in Google Summer of Code 2009, with multiple Red Hat engineers as mentors, and possibly multiple students working together as a team. Students are to be judged by their ability to work with others as a team, their ability to implement the plan, and their ability to encourage participation from further volunteers interested in solving these problems.
I could be wrong but wasn't there a similar GSoC project last year ?
Last year was IntelligentMirror, which had misguided goals and ultimately is not very useful.
It was _not_ misguided. It was designed for people who can't afford to replicate a complete mirror because they pay a fortune for the bandwidth they use. It forced squid to cache, in an intelligent fashion, only the packages which are actually used. That saves a lot of bandwidth which is otherwise wasted by other software keeping packages up to date that are never used within an organization (due to community interest).
IntelligentMirror also solved the problem of "squid doesn't serve a package XYZ from cache when it is fetched from a different mirror", which no other plugin or software that I know of solves to date.
This design if implemented properly could possibly upgrade all Fedora mirrors and make them operate a lot more efficiently. It would also make it real easy to deploy your own mini-mirror to serve your own local network. For example, at home I personally would use InstantMirror with maybe 30GB of cache allocated, but pre-fetching turned off.
Warren Togami Jr. warren@togami.com
Thank you, Kulbir Saini, Computer Science and Engineering, International Institute of Information Technology, Hyderabad, India - 500032.
My Home-Page: http://saini.co.in/ My Linux-Blog: http://fedora.co.in/ My Open-Source Project: http://cachevideos.com/
Kulbir Saini wrote:
It was _not_ misguided. It was designed for people who can't afford to replicate a complete mirror because they pay a fortune for the bandwidth they use. It forced squid to cache, in an intelligent fashion, only the packages which are actually used. That saves a lot of bandwidth which is otherwise wasted by other software keeping packages up to date that are never used within an organization (due to community interest).
IntelligentMirror also solved the problem of "squid doesn't serve a package XYZ from cache when it is fetched from a different mirror", which no other plugin or software that I know of solves to date.
I really don't mean to belabor this tired topic any further, but you are wrong.
Squid can do this without a plugin with only a few squid.conf options that avoid the corner-case mismatch errors when files change content without changing filenames, as is common with repodata or when RPMS are re-signed.
squid.conf options:
    refresh_pattern repodata/.*$ 0 0% 0
    refresh_pattern images/.*$ 0 0% 0
    refresh_pattern .*rpm$ 0 0% 0
https://admin.fedoraproject.org/mirrormanager Then you add a Site-Local net range to MirrorManager so yum clients on your network will automatically prefer a particular mirror. This is either a reverse squid proxy cache running on your own network, or a particular public mirror URL + transparent squid proxy. Either will work just fine with the above 3 lines of squid.conf. This is especially a good idea because you do not need to modify any configurations on individual clients. If you have a Fedora laptop on this network, it will use the cached mirror. If you move your laptop to another network, MirrorManager will tell it where to find a different mirror.
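To see which requests the three refresh_pattern lines would cover, the small Python sketch below applies the same regular expressions to a few made-up URLs. It assumes squid matches each pattern anywhere in the URL and is case-sensitive unless `-i` is given, which is my understanding of refresh_pattern. Python's `re` module stands in for squid's regex engine, and the URLs and the name `always_revalidated` are hypothetical. With min, percent and max all set to 0, a matching object should be treated as stale on every request, so squid checks with upstream before serving it.

```python
import re

# The three regexes from the refresh_pattern lines above.
PATTERNS = [r"repodata/.*$", r"images/.*$", r".*rpm$"]


def always_revalidated(url):
    """Return True if any pattern matches, i.e. the "0 0% 0" settings would apply."""
    return any(re.search(pattern, url) is not None for pattern in PATTERNS)


# Hypothetical URLs, for illustration only.
examples = [
    "http://mirror.example.com/fedora/updates/10/i386/repodata/repomd.xml",
    "http://mirror.example.com/fedora/updates/10/i386/foo-1.0-1.fc10.i386.rpm",
    "http://mirror.example.com/fedora/releases/10/Fedora/i386/os/images/boot.iso",
    "http://mirror.example.com/fedora/README",
]
results = {url: always_revalidated(url) for url in examples}
```

In this sketch the repodata, rpm and images URLs match and the README does not, so only the first three would be revalidated on every request under the stated assumptions.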
Warren Togami wtogami@redhat.com
Warren Togami wtogami@redhat.com writes:
This design if implemented properly could possibly upgrade all Fedora mirrors and make them operate a lot more efficiently. It would also make it real easy to deploy your own mini-mirror to serve your own local network. For example, at home I personally would use InstantMirror with maybe 30GB of cache allocated, but pre-fetching turned off.
InstantMirror looks like a brilliant idea. If some people can't wait for it to finish, there is also pkg-cacher. It has a much smaller scope, but it is very useful if you have a bunch of machines and you don't want to fetch the same package dozens of times.
/Benny
Nice. Can I have it today, with a pony please?
On the issue of design, are you familiar with "wandering trees"? For read-only filesystems where you need to keep snapshots, have transactions, and ensure that new versions are consistent, they are very well-suited.
Rich.
Warren Togami wrote:
https://www.redhat.com/mailman/listinfo/instantmirror-list For questions or comments about this project, please use instantmirror-list.
Warren Togami wtogami@redhat.com