Hey Guys,
I just committed the first bits to the new updates system[0]. At the moment it doesn't do much, but I defined an initial database model (which will also help us see how to integrate this with the package db) and a couple of controllers.
I also updated the UpdatesSystem wiki[1] page with screenshots of the current system that is used to push out core package updates. Hopefully this will give people some context as to the direction this project is going in.
The code should be pretty well commented, especially in places that *need* code.
This project is going to become a top priority for me in about a week, after finals. I'll try to produce a list of tasks in the near future that people can just pick up and work on. But in the meantime, if you would like to help out, I recommend reading over the wiki and getting familiar with the current update process/system. From there, check out the code, `yum install TurboGears` and start playing around with it.
luke
[0]: http://cvs.fedoraproject.org/viewcvs/fedora-updates-system/?root=fedora
[1]: http://fedoraproject.org/wiki/Infrastructure/UpdatesSystem
On Sun, 12 Nov 2006 16:42:20 -0500, Luke Macken wrote:
I'd like to take the opportunity for a few comments in no particular order:
repoview
--------
Running it for all of Fedora Extras takes a lot of time. So long, that even if you let the push script do its work in a background terminal, it is still painful to see how long it takes to complete.
For unknown reasons we also create repoview pages for the "debug" repositories. If it were just my own decision, I would stop doing that, since I doubt those web pages are popular enough. Who really browses repoview for debuginfo packages? We should expect debuginfo packages to be available for every relevant package in the repository.
createrepo < 0.4.5 is unable to handle "unknown" files in its repodata directory. Therefore it conflicts with repoview and terminates prematurely with a fatal error, leaving behind a temporary ".olddir". This is especially ugly, since an administrator would need to recover from that manually and either move back files from ".olddir" or delete it. But when it is deleted, the repoview tree is lost and is created from scratch. I've been told that this is a problem for mirrors, where several thousand files are re-examined for changes just because of the fresh time stamps. So, as of a few weeks ago, the push script works around that successfully with a repodata backup strategy outside of createrepo.
For createrepo and repoview to run in the background as a scheduled job, they need a local lock on the repository. Most likely _not_ fine-grained locking on every arch-specific sub-repository, because every lock comes at a cost (especially when there are multiple jobs of different priority waiting).
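A minimal sketch of one repository-wide lock of the kind described above, assuming a POSIX host and Python 3. The name `repo_lock` and the `.repo.lock` file name are hypothetical and are not taken from the push script.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def repo_lock(repo_dir, lock_name=".repo.lock"):
    """Hold one exclusive lock for the whole repository while the block runs.

    Assumption: a single lock file at the top of the repository, rather than
    one lock per arch-specific sub-repository.
    """
    lock_path = os.path.join(repo_dir, lock_name)
    with open(lock_path, "w") as handle:
        # Blocks until any other job holding the lock has released it.
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            yield lock_path
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

A scheduled job would wrap its createrepo and repoview runs in `with repo_lock(repo_dir): ...`, so that a second job waits instead of working on a half-updated tree.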
repoclosure
-----------
Running this takes even more time. Currently, it examines all of Fedora Extras + Core + Updates + Legacy in a background job after packages have been published. The time it takes is approximately the difference between the time stamps of the build report and the broken deps report.
[...]
''' Pushing '''
- Moves packages to proper updates stage
More "stages" which are understood by plague would be good. We only use one stage, needsign, which is the build-results repository known to the build servers.
Fedora Extras started with a small collection of sh/py scripts for signing and moving rpms from plague's build-results directory into the repository. Among the recurring problems that led to some of the development on the push script(s):
- pulling away built rpms from under plague's feet
Initially (long ago) we signed rpms directly in the needsign repository and moved the packages into the local master repository. Due to that, they became unavailable to the build servers until they appeared on the public master repository. Particularly troublesome, since we push to RDU, and they are synced from there to RH, which is not an immediate operation.
- permission problems
Even with a shared gid and umask, there are remaining problems, such as explicit directory mode 0755 in yum backend code or Python modules.
- disk space constraints
No longer an issue since the larger hdds were installed. But it required working with temporary directories for signing and publishing packages in order to avoid breakage in the middle of a push. It's not trivial to recover from that without the help of a database or transaction state information.
- updates repo cleaner
- remove old packages
So far, the script I named repoprune is much faster than repomanage and simplifies Fedora Extras repository maintenance a lot, since it gets rid of orphaned and out-of-date sub-packages automatically, too.
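To illustrate the "orphaned sub-packages" case, here is a toy sketch, not repoprune's actual code. It assumes each binary package records which source package and which build it came from. The `Package` record and the integer `build` field are hypothetical stand-ins for real RPM header data and epoch/version/release comparison.

```python
from collections import namedtuple

# Hypothetical record; a real tool would read these fields from RPM headers.
Package = namedtuple("Package", "filename source build")


def prune_candidates(packages, keep=1):
    """Return filenames not built from the newest `keep` builds of their source."""
    builds_by_source = {}
    for pkg in packages:
        builds_by_source.setdefault(pkg.source, set()).add(pkg.build)

    doomed = []
    for pkg in packages:
        # Newest builds first; `build` is assumed to sort like a version.
        newest = sorted(builds_by_source[pkg.source], reverse=True)[:keep]
        if pkg.build not in newest:
            doomed.append(pkg.filename)
    return sorted(doomed)
```

Grouping by source package rather than by binary package name is what catches a sub-package that a newer build no longer produces: it is the newest file of its own name, but it belongs to an outdated build.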
On Fri, Nov 24, 2006 at 11:09:58AM +0100, Michael Schwendt wrote: [...]
repoview
Running it for all of Fedora Extras takes a lot of time. So long, that even if you let the push script do its work in a background terminal, it is still painful to see how long it takes to complete.
For unknown reasons we also create repoview pages for the "debug" repositories. If it were just my own decision, I would stop doing that, since I doubt those web pages are popular enough. Who really browses repoview for debuginfo packages? We should expect debuginfo packages to be available for every relevant package in the repository.
Yeah, repoview for debuginfo packages seems unnecessary. We don't even run repoview for updates{,-testing} at the moment, but we can easily integrate it with the new system if we want.
createrepo < 0.4.5 is unable to handle "unknown" files in its repodata directory. Therefore it conflicts with repoview and terminates prematurely with a fatal error, leaving behind a temporary ".olddir". This is especially ugly, since an administrator would need to recover from that manually and either move back files from ".olddir" or delete it. But when it is deleted, the repoview tree is lost and is created from scratch. I've been told that this is a problem for mirrors, where several thousand files are re-examined for changes just because of the fresh time stamps. So, as of a few weeks ago, the push script works around that successfully with a repodata backup strategy outside of createrepo.
The old updates system code (and the fedora-updates-clean cronjob) has hacks around this, as well, when dealing with the updateinfo.xml.gz. I haven't checked, but how does createrepo >= 0.4.5 deal with unknown files now?
For createrepo and repoview to run in the background as a scheduled job, they need a local lock on the repository. Most likely _not_ fine-grained locking on every arch-specific sub-repository, because every lock comes at a cost (especially when there are multiple jobs of different priority waiting).
Yeah, there definitely needs to be mutual exclusion with the updates staging repository.
repoclosure
Running this takes even more time. Currently, it examines all of Fedora Extras + Core + Updates + Legacy in a background job after packages have been published. The time it takes is approximately the difference between the time stamps of the build report and the broken deps report.
I'd like to run repoclosure on the updates before pushing them out at all (sure we have updates-testing, but I don't want that becoming another rawhide).
I spoke with skvidal over the summer about speeding up the closure process, and seem to remember mention of caching the provides/requires to gain some speed. We'll definitely have to look into this, as this process is indeed dreadfully slow.
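As a rough illustration of why caching the provides helps, here is a toy closure check that indexes every provide once and then tests each requirement against that set. It is only a sketch under simplifying assumptions (no versioned or file dependencies, no architectures) and is not how repoclosure or yum are implemented. The `packages` structure is a hypothetical in-memory stand-in for repository metadata.

```python
def unresolved_requires(packages):
    """Map package name -> sorted list of requirements that nothing provides.

    `packages` maps a package name to a dict with optional "provides" and
    "requires" lists of plain strings.
    """
    # Build the set of everything provided once, instead of searching all
    # packages again for every single requirement.
    provided = set()
    for name, meta in packages.items():
        provided.add(name)  # a package implicitly provides its own name
        provided.update(meta.get("provides", []))

    broken = {}
    for name, meta in packages.items():
        missing = sorted(r for r in meta.get("requires", []) if r not in provided)
        if missing:
            broken[name] = missing
    return broken
```

Run against the candidate updates plus the repositories they depend on, a check like this could gate a push: an empty result means no package in the set has a dangling requirement.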
''' Pushing '''
- Moves packages to proper updates stage
More "stages" which are understood by plague would be good. We only use one stage, needsign, which is the build-results repository known to the build servers.
Well, once we get the Brew situation figured out, we can use its detached gpg signatures to help with the push process. Whatever stages we want (pending/testing/needsign/pushed) should probably be implemented in the updates system itself, to allow the buildsystem to only worry about spitting out builds. Although, my Brew-fu is weak, so we'll have to see what it offers once/if it is released.
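A small sketch of what keeping the stages inside the updates system could look like. The stage names are the ones mentioned above; the allowed transitions, `StageError`, and `advance` are assumptions made up for illustration, not a design anyone has agreed on.

```python
# Stage names come from the thread; which moves are allowed is an
# assumption for illustration only.
ALLOWED = {
    "pending": {"needsign"},
    "needsign": {"testing", "pushed"},
    "testing": {"pushed"},
    "pushed": set(),
}


class StageError(Exception):
    """Raised when an update is asked to make a move that is not allowed."""


def advance(current, target):
    """Return `target` if moving there from `current` is allowed."""
    if target not in ALLOWED:
        raise StageError("unknown stage: %s" % target)
    if target not in ALLOWED.get(current, set()):
        raise StageError("cannot move from %s to %s" % (current, target))
    return target
```

With a table like this, the buildsystem only has to hand over builds, and every stage change goes through one function that can also log or store the transition.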
[..]
- updates repo cleaner
- remove old packages
So far, the script I named repoprune is much faster than repomanage and simplifies Fedora Extras repository maintenance a lot, since it gets rid of orphaned and out-of-date sub-packages automatically, too.
Awesome, I will see what I can do about integrating this tool into the new system.
luke
On Mon, 11 Dec 2006 23:45:16 -0500, Luke Macken wrote:
createrepo < 0.4.5
The old updates system code (and the fedora-updates-clean cronjob) has hacks around this, as well, when dealing with the updateinfo.xml.gz. I haven't checked, but how does createrepo >= 0.4.5 deal with unknown files now?
It moves old files and restores them when it's done. Still, with a CLI push-script, something like Ctrl+C would interrupt createrepo in an unwanted way and require manual cleanup. Hence a backup/restore mechanism outside of it is good.
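A minimal sketch of such an outside backup/restore wrapper, assuming Python 3. `run_with_repodata_backup` and its `action` callback are hypothetical names; `action` stands for whatever actually invokes createrepo, and this is not the push script's code.

```python
import os
import shutil
import tempfile


def run_with_repodata_backup(repo_dir, action):
    """Run `action(repo_dir)`; put repodata back if it fails or is interrupted."""
    repodata = os.path.join(repo_dir, "repodata")
    backup_root = tempfile.mkdtemp(prefix="repodata-backup-")
    backup = os.path.join(backup_root, "repodata")
    had_repodata = os.path.isdir(repodata)
    if had_repodata:
        # copytree uses copy2 by default, so file time stamps are preserved.
        shutil.copytree(repodata, backup)
    try:
        action(repo_dir)
    except BaseException:
        # BaseException also covers KeyboardInterrupt (Ctrl+C).
        shutil.rmtree(repodata, ignore_errors=True)
        if had_repodata:
            shutil.copytree(backup, repodata)
        raise
    finally:
        shutil.rmtree(backup_root, ignore_errors=True)
```

Because the restore copies the old files back with their original time stamps, an interrupted run should not make mirrors re-examine thousands of unchanged files.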
It seems createrepo 0.4.6 from 11-Aug-2006 is the only version after 0.4.4 (FC6).
Hey guys,
I updated the UpdatesSystem[0] wiki last night with a bunch of tasks that need to get done, and a couple of wishlist items. For those looking to help out, feel free to grab anything on there (even if my name is on it), or add something you would like to see, and start hacking. Don't hesitate to speak up on IRC or onlist with any questions/comments/concerns.
Development has been going fairly smoothly (except for a fun little exception thrown after trying to bump SQLObject that I still need to investigate). Since the buildsystem stuff is currently up in the air, I set up the development environment to use a LocalTest Buildsystem (see buildsys.py), which just points to a `pkg/ver/rel/arch` hierarchy for pulling in built updates.
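For anyone who wants to set up a test tree by hand, a `pkg/ver/rel/arch` layout can be walked roughly like this. It is a sketch only, not the contents of buildsys.py; `build_dir` and `find_rpms` are hypothetical names.

```python
import os


def build_dir(root, name, version, release, arch):
    """Directory holding one build under a pkg/ver/rel/arch layout."""
    return os.path.join(root, name, version, release, arch)


def find_rpms(root, name, version, release):
    """Return {arch: [rpm filenames]} for one build, or {} if it is absent."""
    base = os.path.join(root, name, version, release)
    found = {}
    if not os.path.isdir(base):
        return found
    for arch in sorted(os.listdir(base)):
        arch_dir = os.path.join(base, arch)
        if os.path.isdir(arch_dir):
            rpms = sorted(f for f in os.listdir(arch_dir) if f.endswith(".rpm"))
            if rpms:
                found[arch] = rpms
    return found
```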
Once I polish up the push code, I'm going to write a bunch of test cases and commit my test-build tree as well, with a couple of RPMs, so that people can run the tests and push updates 'out of the box'.
I'm not really in a rush to get this running on publictest2 at the moment, because you can get full functionality by checking out the code locally and running `./start-updatesssystem`. However, I will be giving publictest2 some love in the near future, as I plan on reading the 'Deployment' chapter in my TurboGears book very soon.
luke
[0]: http://fedoraproject.org/wiki/Infrastructure/UpdatesSystem
infrastructure@lists.fedoraproject.org