One of the things that we're looking at for Fedora 8 is how to integrate deltarpms and the presto yum plugin so that we can reduce the amount that our users have to download. Making this happen, though, is going to require some work on the build system and compose tools so that we can generate the deltas without any real problems. Based on a discussion Jesse and I had while driving home yesterday, we arrived at the following proposal. (Note: any errors here are probably mine, since I transcribed it a while after the conversation.)
The first thing is that it makes the most sense for the deltas to be created and stored by koji rather than as a secondary process. This has the advantage that they're stored consistently with the packages and can also be cached rather than recreated every time. It feels somewhat analogous to me to the current situation with signing.
To handle the case of deltas for updates, we can have a step happen either pre-mash or early in mash that calls out to koji to create deltas. We'll want to create deltas to the update from both the dist-fX-GA package and the latest package in dist-fX-updates, and include them. We then just need to generate the delta metadata and do modifyrepo, much like we'll be doing for the updateinfo.
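As a rough illustration of the update case, here is a minimal Python sketch of working out which deltas one update needs. The function names and the NVR strings are hypothetical and are not part of koji or mash; the only real tool referenced is makedeltarpm, whose basic invocation is `makedeltarpm oldrpm newrpm deltarpm`. The sketch only builds the command list and does not run it.

```python
from typing import List, Optional, Tuple


def plan_update_deltas(
    new_nvr: str,
    ga_nvr: Optional[str],
    latest_update_nvr: Optional[str],
) -> List[Tuple[str, str]]:
    """Return the (old, new) pairs that need a delta for one update.

    ga_nvr is the build tagged for GA (None if the package is new since GA);
    latest_update_nvr is the previous build in the updates tag (None if the
    package has never been updated).
    """
    pairs: List[Tuple[str, str]] = []
    for old in (ga_nvr, latest_update_nvr):
        # Skip missing bases and a delta from a build to itself.
        if old is None or old == new_nvr:
            continue
        pair = (old, new_nvr)
        # Skip duplicates, e.g. when the GA build is also the latest update.
        if pair not in pairs:
            pairs.append(pair)
    return pairs


def makedeltarpm_command(old_rpm: str, new_rpm: str, delta_path: str) -> List[str]:
    """Build (but do not run) the makedeltarpm call for one pair."""
    return ["makedeltarpm", old_rpm, new_rpm, delta_path]
```

For a package that shipped at GA and has been updated once already, this yields two pairs: GA to the new update, and the previous update to the new update.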
Updates-testing can basically work identically to updates.
Generating deltas for rawhide is a little trickier. The idea we got to was that we'd change dist-rawhide to not inherit and instead tag packages with it explicitly. Then, for the packages we tag, we can also generate a delta from the last dist-rawhide version to the new version. Mash can then pull from the dist-rawhide tag (like now) and use deltas much like in the update case.
What do people think? Entirely crazy? Just a little bit crazy? Has holes big enough to drive a truck through?
Jeremy
On Thu, 21 Jun 2007, Jeremy Katz wrote:
What do people think? Entirely crazy? Just a little bit crazy? Has holes big enough to drive a truck through?
We discussed this on the Red Hat Network team, like, 4 years ago. Our conclusion was that it was too crazy. I still regret that decision.
Get deltarpm started, and when it breaks (and it will, horribly) fix it. We've needed it for a very long time.
--g
On Thursday 21 June 2007 17:12:11 Bill Nottingham wrote:
This sounds ridiculously painful, if we're not doing this automatically.
Oh no, it'll be automatic. The idea is that at the start of the 'push rawhide' script a call to, say, 'koji clone-tag <sourcetag> dist-rawhide' would quickly go through and tag with dist-rawhide all the newest builds of <sourcetag> that aren't already tagged.
This actually has some other benefits in that we can tell what has or hasn't been shipped in rawhide for garbage collection purposes.
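To make the rawhide flow concrete, here is a minimal Python sketch of the selection step, assuming we already have two mappings from package name to build: the newest build in the source tag and the build currently tagged dist-rawhide. The function name and the data shapes are hypothetical; this does not call koji and is not how clone-tag is implemented.

```python
from typing import Dict, List, Tuple


def plan_rawhide_push(
    latest_in_source: Dict[str, str],
    rawhide_tagged: Dict[str, str],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return (builds to tag, (old, new) delta pairs) for one rawhide push.

    latest_in_source: package name -> newest build (NVR) in the source tag.
    rawhide_tagged:   package name -> build currently tagged dist-rawhide.
    """
    to_tag: List[str] = []
    deltas: List[Tuple[str, str]] = []
    for name in sorted(latest_in_source):
        new = latest_in_source[name]
        old = rawhide_tagged.get(name)
        if old == new:
            continue  # already shipped in rawhide, nothing to do
        to_tag.append(new)
        if old is not None:
            # Delta from the last build shipped in rawhide to the new one.
            deltas.append((old, new))
    return to_tag, deltas
```

Packages that are new to rawhide get tagged but have no delta, since there is no earlier shipped build to diff against.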
On Thu, 2007-06-21 at 17:16 -0400, Jesse Keating wrote:
On Thursday 21 June 2007 17:12:11 Bill Nottingham wrote:
This sounds ridiculously painful, if we're not doing this automatically.
Oh no, it'll be automatic. The idea is that at the start of the 'push rawhide' script a call to, say, 'koji clone-tag <sourcetag> dist-rawhide' would quickly go through and tag with dist-rawhide all the newest builds of <sourcetag> that aren't already tagged.
+1 for the clone-tag functionality.
rob.
Jeremy Katz wrote:
The first thing is that it makes the most sense for the deltas to be created and stored by koji rather than as a secondary process. This has the advantage that they're stored consistently with the packages and can also be cached rather than recreated every time. It feels somewhat analogous to me to the current situation with signing.
While I see the semi-parallel with signatures, I'd rather not rush into adding this to koji. I'd like to have a better understanding of how these deltas need to be managed.
Do we anticipate Koji actually having a use for the deltas, or would it just be storing them for other tools?
Can deltarpms be signed independently of the rpms they compare? If so, we may need to think about tracking these signatures.
How should we deal with the delta/signature interaction? Is there a quick way to read the target's signature info from the delta (applydeltarpm -i doesn't seem to report it)? Each rpm in koji can have multiple signatures, and we would presumably care which signature will be used for the target rpm in the delta. This leads me to wonder about naming schemes and API needs.
With signatures, the cached files are tiny, there are unlikely to be more than a handful of them per rpm, and it is clearly reasonable to keep them for as long as the rpm is kept. Even deltas for trivial cases seem to be much larger than a cached signature header, and one can imagine accumulating a large number of deltas for an rpm. So the question is, how long should deltas be kept, and what should trigger their removal?
On Fri, 2007-06-22 at 16:00 -0400, Mike McLean wrote:
Jeremy Katz wrote:
The first thing is that it makes the most sense for the deltas to be created and stored by koji rather than as a secondary process. This has the advantage that they're stored consistently with the packages and can also be cached rather than recreated every time. It feels somewhat analogous to me to the current situation with signing.
While I see the semi-parallel with signatures, I'd rather not rush into adding this to koji. I'd like to have a better understanding of how these deltas need to be managed.
Fair enough...
Do we anticipate Koji actually having a use for the deltas, or would it just be storing them for other tools?
I suspect largely storage. Since we recreate build roots every time, the deltas aren't ever going to matter on the building packages side. So the main advantage of doing it in koji is consistent storage and retrieval.
Can deltarpms be signed independently of the rpms they compare? If so, we may need to think about tracking these signatures.
The deltas aren't independently signed. The way that the deltas work is that they take the bits off of the filesystem + the deltarpm itself to recreate the original RPM. You then have the original package, and you verify it (including signature).
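The two steps can be sketched in Python as below. This is an illustration only, not what presto does internally: it assumes the applydeltarpm and rpm command-line tools are installed, and the function name and the injectable `run` parameter are hypothetical, added so the flow can be exercised without either tool present.

```python
import subprocess
from typing import Callable, Optional, Sequence


def rebuild_and_verify(
    delta_path: str,
    output_rpm: str,
    run: Optional[Callable[[Sequence[str]], int]] = None,
) -> bool:
    """Rebuild the full rpm from a delta, then verify it like any download."""
    if run is None:
        # Default runner: execute the command and report its exit status.
        run = lambda cmd: subprocess.run(list(cmd)).returncode
    # Step 1: recreate the new rpm from the installed files plus the delta.
    if run(["applydeltarpm", delta_path, output_rpm]) != 0:
        return False
    # Step 2: check digests and signature on the rebuilt package.
    return run(["rpm", "--checksig", output_rpm]) == 0
```

The point of the design is that trust rests entirely on the rebuilt package, so the delta itself never needs its own signature.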
How should we deal with the delta/signature interaction? Is there a quick way to read the target's signature info from the delta (applydeltarpm -i doesn't seem to report it)? Each rpm in koji can have multiple signatures, and we would presumably care which signature will be used for the target rpm in the delta. This leads me to wonder about naming schemes and API needs.
Yeah, given this we probably are going to want to be able to have multiple deltas taking into account the signatures. Although that just makes me cringe a little in pain...
With signatures, the cached files are tiny, there are unlikely to be more than a handful of them per rpm, and it is clearly reasonable to keep them for as long as the rpm is kept. Even deltas for trivial cases seem to be much larger than a cached signature header, and one can imagine accumulating a large number of deltas for an rpm. So the question is, how long should deltas be kept, and what should trigger their removal?
Removal should be triggered the same way as removal of the package -- I don't think you'll want to do separate garbage collection there. And yes, they're larger than the signature, but I don't see any way around that. Generating them on demand is going to be worse from the point of view of regenerating the same bits over and over. We've got to keep the generated ones somewhere. Keeping them outside of the buildsystem means we have another database, another file store, etc.
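A minimal Python sketch of that policy, purely as an illustration: deltas are cached under a key of (old build, new build, signing key of the target), generated at most once, and dropped whenever either package they refer to is removed. The class, its methods and the in-memory store are all hypothetical and do not reflect any existing koji API or schema.

```python
from typing import Callable, Dict, Tuple

# (old NVR, new NVR, signing key id of the target rpm)
DeltaKey = Tuple[str, str, str]


class DeltaCache:
    """In-memory stand-in for a store of generated deltas."""

    def __init__(self) -> None:
        self._store: Dict[DeltaKey, str] = {}
        self.generated = 0  # how many times a delta was actually built

    def get_or_create(
        self,
        old: str,
        new: str,
        sigkey: str,
        generate: Callable[[str, str, str], str],
    ) -> str:
        """Return the cached delta path, generating it only on first request."""
        key = (old, new, sigkey)
        if key not in self._store:
            self._store[key] = generate(old, new, sigkey)
            self.generated += 1
        return self._store[key]

    def remove_package(self, nvr: str) -> int:
        """Drop every delta that has nvr as its source or target."""
        doomed = [k for k in self._store if nvr in (k[0], k[1])]
        for key in doomed:
            del self._store[key]
        return len(doomed)

    def __len__(self) -> int:
        return len(self._store)
```

Including the signing key in the cache key is one way to handle the multiple-signature case raised above, at the cost of one delta per signature.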
Jeremy
buildsys@lists.fedoraproject.org