On Tue, Jul 25, 2017 at 12:12:30PM -0400, Dennis Gregorovic wrote:
On 07/25/2017 10:59 AM, Paul W. Frields wrote:
I'd meant to raise this question last week but it turned out several folks were out of pocket who'd probably want to discuss. One of the aspects of continuous integration[1] that impacts my team is the storage requirement. How much storage is required for keeping test results, composed trees and ostrees, and other artifacts? What is their retention policy?
A policy of "keep everything ever made, forever" clearly isn't scalable. We don't do that in the non-CI realm either, e.g. with scratch builds. I do think that we must retain everything we officially ship; that's well understood. But atop that, anything we keep costs storage, and over time this storage costs money. So we need to draw some reasonable line that balances thrift and service.
A. Retention
The second question is probably a good one to start with, so we can answer the first. So we need to answer the retention question for some combination of:
1. candidate builds that fail a CI pipeline
2. candidate builds that pass a CI pipeline
3. CI composed testables
   - a tree, ISO, AMI, other image, etc. that's a unit
   - ostree change which is more like a delta (AIUI)
4. CI generated logs
5. ...other stuff I may be forgetting
The other big bucket is packages in the buildroot used to build the builds. You may want to keep these as well if there is a desire to be able to rebuild packages at a later point.
Thanks for that. My bet is we'd want to set the retention dial for those no higher than "when that release goes EOL."
My general thoughts are that these things are kept forever:
- (2), but only if that build is promoted as an update or as part of a shipped tree/ostree/image
- (3), but only if the output is shipped to users
- (4), but only if corresponding to an item in (2) or (3)
Outside that, artifacts and logs are kept only for a reasonable amount of troubleshooting time. Say 30 days, but I'm not too worried about the actual time period. It could be adjusted based on factors we have yet to encounter.
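To make the rule above concrete, here is a minimal sketch of it as a retention check. Everything in it is hypothetical and for illustration only: the `Artifact` record, the `Kind` names, the `shipped` and `parent` fields, and the function names are assumptions, not anything that exists in Fedora's tooling today. The 30-day window is just the placeholder figure suggested above, and buildroot packages are not modeled.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional

# Placeholder troubleshooting window ("say 30 days"); meant to be adjustable.
TROUBLESHOOTING_WINDOW = timedelta(days=30)


class Kind(Enum):
    # Hypothetical names mirroring the categories listed earlier in the thread.
    FAILED_BUILD = 1       # candidate build that failed a CI pipeline
    PASSED_BUILD = 2       # candidate build that passed a CI pipeline
    COMPOSED_TESTABLE = 3  # tree, ISO, AMI, other image, ostree change
    LOG = 4                # CI generated log


@dataclass
class Artifact:
    kind: Kind
    created: date
    # True if promoted as an update, or shipped to users as (part of)
    # a tree/ostree/image. Assumed to be recorded by some other system.
    shipped: bool = False
    # For logs: the build or testable the log belongs to (assumed field).
    parent: Optional["Artifact"] = None


def keep_forever(artifact: Artifact) -> bool:
    """Return True if the artifact falls under the 'kept forever' cases."""
    if artifact.kind in (Kind.PASSED_BUILD, Kind.COMPOSED_TESTABLE):
        return artifact.shipped
    if artifact.kind is Kind.LOG:
        # A log is kept only if the thing it describes is kept forever.
        return artifact.parent is not None and keep_forever(artifact.parent)
    return False


def should_retain(artifact: Artifact, today: date) -> bool:
    """Keep forever, or else only within the troubleshooting window."""
    age = today - artifact.created
    return keep_forever(artifact) or age <= TROUBLESHOOTING_WINDOW
```

A cleanup job could then walk the artifact store and delete whatever `should_retain` rejects; changing the window is a one-line edit if 30 days turns out to be the wrong number.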
How does this proposal compare to the existing practice in Fedora?
My initial guess is, pretty well in concept -- but that the *practice* is that we've not been very aggressive about trimming ancient shipped stuff. How many people are liable to seek Fedora <= 18 releases at this point, for example?