Hi, folks! I thought this might be about the appropriate time to throw this out there.
There hasn't been a big news push on this, but some of you may know that releng is fairly close to switching over to Pungi 4 for composes. For those of you who don't know:
releng is fairly close to switching over to Pungi 4 for composes.
This will have various interesting effects on QA and the whole process of building Fedora releases.
With the current releng process, TC / RC composes are one beast, and nightly composes are another, very different beast. In fact nightly composes barely really 'exist' at all - when we say 'nightly compose' we really mean 'pungify the rawhide/branched repo, and fire off a bunch of koji tasks'. After the fact, there is no real relationship between any of those bits, which is why I had to write fedfind to go out and synthesize the concept of a 'nightly compose' by finding all the Koji tasks and treating them plus the repository boot.iso's as a single 'compose'.
With Pungi 4, all composes will look a lot more similar. 'Nightly' composes (which, in point of fact, will probably happen more than once per day - I'm not sure if we came up with a new name yet) will look a lot more like current TC/RC composes than current nightly composes. You can see approximately what a Pungi 4 compose currently looks like here:
https://kojipkgs.fedoraproject.org/compose/rawhide/
As of right now, the Koji-built bits - lives, cloud and ARM disk images, etc. - aren't integrated with the installer images, but they *will* be, and they'll all show up in the same location. As you can see, it has all the different variants, and a Server DVD image. (A Pungi 4 compose also has a bunch of metadata, which means we can more or less kill off fedfind, thank God).
The implication of this I wanted to talk about in this thread is: what does this mean for the release validation process, in terms of what composes we cut and what release validation events we have?
So as you probably know, right now, the validation process is built around the milestone 'TC' and 'RC' images. We build a series of Alpha TCs and run a bunch of tests for each of these composes, reporting the results to wiki pages named for the composes. Then we do Alpha RCs, then Beta TCs, and so on through Final RCs.
For the last few releases we've added on some 'nightly' validation events, where we create wiki pages named for nightly composes and run the same set of tests on the nightly boot.iso's and Koji images, but these have been framed as kind of an 'early warning system' for use before Alpha TC1 arrives, and once Alpha TC1 arrives we stop doing the nightly validation events.
With Pungi 4, I don't think this makes a lot of sense any more. Dennis and I have been talking about this and I think we broadly agree on it.
TCs and RCs used to be kinda the only way we *could* do validation testing. For long periods we didn't have reliable nightly builds of Rawhide or Branched at all, certainly not all the Koji-produced images. The process for doing 'real' composes was quite long and painful and required squishy human intervention.
If we have automated, more-than-nightly composes that look much like a regular release compose would, there's no clear case for having TCs at all. We could simply stop building them and extend the "nightly" validation process. I think the way to do that would be to keep 'nominating' nightly composes for validation testing all the time, *except* when we're doing RCs. So instead of going something like:
24 Rawhide 20160120
24 Rawhide 20160215
== BRANCH POINT ==
24 Branched 20160301
24 Branched 20160315
24 Alpha TC1
24 Alpha TC2
== ALPHA FREEZE ==
24 Alpha RC1
24 Alpha RC2
== ALPHA RELEASE ==
24 Beta TC1
....
we'd go something like:
24 Rawhide 20160120
24 Rawhide 20160215
== BRANCH POINT ==
24 Branched 20160301
24 Branched 20160315
24 Branched 20160401
24 Alpha RC1
24 Alpha RC2
== ALPHA RELEASE ==
24 Branched 20160501
24 Branched 20160515
24 Beta RC1
....
Note: all dates are completely made up; this is just for illustration.
I think it would be plausible to do this for Fedora 24, if the Pungi 4 switchover happens soon and goes well. There would be some details to pin down in relval and wikitcms and stuff (we might need to tweak the validation event naming approach a bit so that it's possible to identify the sequence of events from the names - i.e. so you know where the RCs fit in), but nothing unsolvable.
We'll be talking about a lot of this stuff at DevConf. If anyone's going to be there, pin down me or Dennis or someone else involved in release-y stuff and we'd be happy to discuss it. But I wanted to throw something up on the lists for discussion as well. What do you think? Thanks!
One point that's come up already is the way that we manually pull newer packages to fix blocker/FE bugs into TC and RC composes via the 'bleed' repo. We're currently envisaging something like the 'buildroot override' mechanism for the compose process - some kind of system which would tag packages to be pulled into the composes somehow. It would still be gated through the blocker/FE review process at least during freezes, and probably all the time (it wouldn't be open season for any packager to request a 'compose override' at any time). This would also allow us to do stuff like 'tag new anaconda builds into the composes as soon as they land in updates-testing, so we can actually test them and provide karma'.
Here's a question. Are we going to "nominate" only those composes in which a substantial component changed (e.g. anaconda or systemd), similarly to what we do now in rawhide, or are we going to nominate each new compose (i.e. one or more per day)? The first approach seems simpler for humans, but I can't imagine how we make it work for e.g. Desktop matrices - there are so many components in there that we would probably end up nominating every day anyway. The second approach means we would let automation do its job and humans would have to rely mainly on testcase_stats to see which test cases were recently tested and which were not, and test according to that. I think the second approach is something that we should aim for in the future, but I'm not sure we're there yet. It will certainly require some larger changes in testcase_stats to make sure they correctly represent everything (now that we'll rely solely on that), e.g. not squashing different test environments together into a single result, etc.
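To illustrate the point about not squashing different test environments together, here is a minimal Python sketch. It is not how testcase_stats actually works; the record layout, the test case name and the environment labels are all assumptions made up for illustration. The idea is just that "when was this last tested?" is tracked per (test case, environment) pair rather than per test case.

```python
# Hypothetical result records: (test case, environment, compose date, outcome).
# Names and values are invented for illustration only.
results = [
    ("QA:Testcase_example", "x86_64 GNOME", "20160301", "pass"),
    ("QA:Testcase_example", "x86_64 KDE", "20160301", "fail"),
    ("QA:Testcase_example", "x86_64 GNOME", "20160315", "pass"),
]


def last_tested(records):
    """Map (test case, environment) -> most recent compose with a result.

    Compose dates are YYYYMMDD strings, so plain string comparison
    orders them chronologically.
    """
    latest = {}
    for testcase, env, compose, _outcome in records:
        key = (testcase, env)
        if key not in latest or compose > latest[key]:
            latest[key] = compose
    return latest


latest = last_tested(results)
for (testcase, env), compose in sorted(latest.items()):
    print(f"{testcase} [{env}]: last tested on {compose}")
```

With the environment kept in the key, a test that was recently run on one environment but not another still shows up as stale for the one that was missed.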
On Fri, 2016-01-29 at 09:26 -0500, Kamil Paral wrote:
> Here's a question. Are we going to "nominate" only those composes in which a substantial component changed (e.g. anaconda or systemd), similarly to what we do now in rawhide, or are we going to nominate each new compose (i.e. one or more per day)?
That's definitely something to consider, yeah. It's logic that's quite easy to tweak.
> The first approach seems simpler for humans, but I can't imagine how we make it work for e.g. Desktop matrices - there are so many components in there that we would probably end up nominating every day anyway.
Well, I intentionally never tried to extend the list of 'significant packages' to every single one which could *possibly* cause anaconda's behaviour to change, and I wouldn't suggest it would make sense to do that for GNOME either. Really it just seemed like a neat way of regulating the flow of nominated composes. Note the mechanism is a bit more complex than you mentioned: there is a pair of time constraints. It *always* waits at least three days between nominations, and if two weeks go by without a 'significant' package change it'll go ahead and nominate anyway (that may have kicked in once :>).
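As a rough sketch of the nomination rule just described (in Python, and not the actual code): the function name, the constants and the 'significant' package set below are all hypothetical, and only the three-day minimum and two-week fallback come from the description above.

```python
from datetime import date, timedelta

# Hypothetical names and values, for illustration only.
MIN_GAP = timedelta(days=3)    # always wait at least this long between nominations
MAX_GAP = timedelta(days=14)   # after this long, nominate regardless of changes
SIGNIFICANT = {"anaconda", "systemd"}  # illustrative, not the real list


def should_nominate(today, last_nominated, changed_packages):
    """Decide whether the compose dated `today` should be nominated."""
    elapsed = today - last_nominated
    if elapsed < MIN_GAP:
        # Too soon after the previous nomination, whatever changed.
        return False
    if elapsed >= MAX_GAP:
        # Two weeks without a nomination: go ahead anyway.
        return True
    # In between: nominate only if a 'significant' package changed.
    return bool(SIGNIFICANT & set(changed_packages))
```

The two time limits are what make this 'quite easy to tweak': changing the flow of nominated composes is a matter of adjusting the gaps or the package set.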
> The second approach means we would let automation do its job and humans would have to rely mainly on testcase_stats to see which test cases were recently tested and which were not, and test according to that. I think the second approach is something that we should aim for in the future, but I'm not sure we're there yet. It will certainly require some larger changes in testcase_stats to make sure they correctly represent everything (now that we'll rely solely on that), e.g. not squashing different test environments together into a single result, etc.
This is broadly my take, yeah. Honestly, I think it might be time to go back into the test framework jungle, though we might actually wind up in the dreaded 'build our own' position this time. I've been vaguely thinking about a system to consolidate automated and manual test results into resultsdb. So we'd have something that would submit results from autocloud and openQA to resultsdb, and we'd build some kind of client (webapp or whatever) for submission of manual test results, and displaying all the combined results from automated test systems and manual testers.
In my mind this system doesn't actually store or display test cases; they stay in the wiki. Each test case has a permanent ID and a changeable URL, so we can rename test cases where appropriate. The new bits would simply link out to the wiki where appropriate.
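To make the 'permanent ID, changeable URL' idea a bit more concrete, here's a minimal Python sketch. This is not resultsdb's schema or API; the class names, fields and example URLs are invented for illustration. Results refer to a test case only by its ID, so renaming the wiki page doesn't orphan old results.

```python
from dataclasses import dataclass


@dataclass
class WikiTestCase:
    """A test case that lives in the wiki (hypothetical model)."""
    case_id: int  # permanent, never changes
    url: str      # wiki URL, may change when the page is renamed


@dataclass(frozen=True)
class Result:
    """One result, automated or manual (hypothetical model)."""
    case_id: int  # refers to WikiTestCase.case_id, never to the URL
    compose: str
    source: str   # e.g. "openQA", "autocloud" or "manual"
    outcome: str


def wiki_link(result, cases):
    """Look up the current wiki URL for the test case a result refers to."""
    return cases[result.case_id].url


cases = {1: WikiTestCase(1, "https://example.org/wiki/QA:Testcase_old_name")}
result = Result(case_id=1, compose="20160301", source="manual", outcome="pass")

# Renaming the test case only touches the URL; the stored result is unchanged.
cases[1].url = "https://example.org/wiki/QA:Testcase_new_name"
print(wiki_link(result, cases))
```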
It's still just a concept for now, but that's kinda where my mind's going...WDYT? Do you see more mileage in extending testcase_stats?