A Glorious Vision of Our Shared Update Feedback Future (bodhi, karma, and proventesters, oh my) - test - Fedora mailing-lists

22 Nov 2011


Hey, folks.
So in the recent proven tester discussion, and in various other threads,
I've oft stated that the limits of the current Bodhi karma system are a
significant problem, and the planned Bodhi 2.0 karma system has the
potential to significantly improve our update testing process. But it
occurred to me that I haven't really laid out why in much detail, and
those who aren't as involved with the process as we in QA are might not
have a really clear vision of why this is so important.
So, I thought I'd lay it out in the form of a glorious vision of the
future! Note: I have zero UI design skills. This UI as described would
suck. But the idea is that there would be a decent UI which *represents
all the described choices*.
In the Great Bodhi Of The Future, for any given update, a tester will
not simply have a comment box and a drop-down for -1, 0 or +1. They will
see:
* The list of test cases associated with the package, with a PASS / FAIL
choice for each
* Checkboxes for 'This update does / does not break critical path
functionality' (with a link out to the critpath definition)
* A checkbox for 'I installed this update and continued to use my system
as normal and did not note any regressions'
* Any custom choices the package maintainer opts to provide, via some
kind of interface to Bodhi
This is the kind of flexibility that would make karma massively more
useful. The tighter definition of what feedback actually *means*
provides far more information to the maintainer, and enables us to
automate certain outcomes much more aggressively.
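To make that a bit more concrete, here is a rough sketch of what one piece of structured feedback might look like as data. This is purely illustrative: the `Feedback` and `Result` names and every field are made up for this mail, not anything from Bodhi's actual or planned code.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Result(Enum):
    PASS = "pass"
    FAIL = "fail"


@dataclass
class Feedback:
    """One tester's structured feedback on one update (hypothetical model)."""
    tester: str
    # PASS / FAIL per test case, keyed by test case name.
    test_cases: Dict[str, Result] = field(default_factory=dict)
    # True = breaks critpath, False = does not, None = tester made no claim.
    breaks_critpath: Optional[bool] = None
    # 'I installed this update, used my system as normal, saw no regressions.'
    no_regressions_in_normal_use: bool = False
    # Maintainer-defined extra choices, keyed by a maintainer-chosen label.
    custom: Dict[str, bool] = field(default_factory=dict)
    comment: str = ""


# Example: one test case passed, one failed, critpath explicitly not broken.
fb = Feedback(
    tester="alice",
    test_cases={"boots to desktop": Result.PASS, "suspend/resume": Result.FAIL},
    breaks_critpath=False,
)
failed = sorted(name for name, r in fb.test_cases.items() if r is Result.FAIL)
print(failed)
```

The point is just that every field has one fixed meaning, so a script can act on it without anyone having to read the comment.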
For me, one of the principal benefits of such a system would be that we
could make the 'This update breaks critical path functionality' checkbox
an absolute red flashing light, wailing siren emergency button. I
mean, you hit that thing and trucks roll from Fedora HQ, metaphorically
speaking. It would have a confirmation page which clearly described the
impact of asserting that an update broke critpath, so we could be
confident that it didn't get falsely triggered very often. I'd suggest
that:
1. Any update that is marked as 'critpath breaking' by a FAS-registered
tester would be blocked from going any further in the update process
without manual intervention (no autopushes at all)
2. Any update marked as 'critpath breaking' by a proven tester would be
blocked from being pushed stable at all - automatically or manually -
until the PT modified the feedback or it was overridden by someone with
appropriately godlike powers (TBD, but probably not just the maintainer)
3. Any update marked as 'critpath breaking' should probably get
announced on at least test-announce and/or devel-announce
4. Any update marked as 'critpath breaking' *after it has already been
pushed* would similarly trigger a major response: notify maintainer very
hard, notify lists, generally do stuff to make sure it gets immediate
attention
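The four suggestions above could be sketched as a small decision function, something like the following. Everything here is an assumption for illustration: the `CritpathReport` type, the action names, and the way the rules are ordered are mine, and the 'godlike override' part of point 2 is left out entirely.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class CritpathReport:
    """A 'this update breaks critpath' report (hypothetical type)."""
    tester: str
    is_proven_tester: bool


def critpath_actions(reports: List[CritpathReport], already_pushed: bool) -> Set[str]:
    """Return the (made-up) action names triggered by critpath-breaking reports."""
    actions: Set[str] = set()
    if not reports:
        return actions
    # Point 3: any such report gets announced on the lists.
    actions.add("announce_on_lists")
    if already_pushed:
        # Point 4: the update is already out, so escalate loudly instead of blocking.
        actions.add("notify_maintainer_urgently")
        return actions
    # Point 1: any FAS-registered tester's report stops autopushes.
    actions.add("block_autopush")
    # Point 2: a proven tester's report also stops manual pushes to stable.
    if any(r.is_proven_tester for r in reports):
        actions.add("block_all_stable_pushes")
    return actions


print(sorted(critpath_actions([CritpathReport("bob", True)], already_pushed=False)))
```

A report from a regular tester on an unpushed update would get the announcement and the autopush block, but not the full block on stable pushes.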
We would also obviously be able to offer maintainers more nuanced
options for autopushing updates - require a certain number of passes on
a certain set of test cases, for instance.
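As a sketch of what such an autopush rule might look like, assuming a maintainer could say 'N passes on each of these test cases': the function name, the `required_passes` mapping, and the choice to let any failure on a required test case veto the autopush are all my own assumptions, not a description of Bodhi.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def meets_autopush_policy(
    required_passes: Dict[str, int],
    results: Iterable[Tuple[str, str]],
) -> bool:
    """Decide whether feedback satisfies a maintainer's autopush rule.

    required_passes maps test case name -> number of passes needed.
    results is (test_case_name, "pass" or "fail") pairs from all testers.
    """
    results = list(results)
    # Assumption in this sketch: any failure on a required test case is a veto.
    if any(name in required_passes and outcome == "fail" for name, outcome in results):
        return False
    # Count passes per test case and compare against the required numbers.
    passes = Counter(name for name, outcome in results if outcome == "pass")
    return all(passes[name] >= needed for name, needed in required_passes.items())


policy = {"boots to desktop": 2, "network comes up": 1}
feedback = [
    ("boots to desktop", "pass"),
    ("boots to desktop", "pass"),
    ("network comes up", "pass"),
    ("obscure feature Z", "fail"),  # not in the policy, so it does not veto
]
print(meets_autopush_policy(policy, feedback))
```

A maintainer who wanted failures on non-required test cases to count too would need a stricter rule than this one.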
To go a bit into the theory, the really nasty limitation of the current
system is that we simply have very little easily consumable indication of
what the karma provided on an update *means*. A +1 can mean anything
from 'I booted and nothing exploded' to 'I tested every line of this
code, then did it backwards standing on my head'. A -1 can mean 'I
booted this update and then my cat got sick, I think there's a
connection!', 'this update breaks $REALLY OBSCURE FEATURE Z', or 'this
update exploded my monitor'. We just _don't know_. You can get the
information from comments, usually, but that's obviously a complete
non-starter for any kind of programmatized or automated action based on
the feedback. In particular, we currently can't do anything dramatic
based on negative feedback because of the uncertainty around exactly
what it means. We do have a few policies about when to file negative
feedback, but even this only very slightly mitigates the problem - we
can't say 'only file negative feedback if the update breaks critpath',
that's just not feasible. So we can't identify the 'really really
negative' feedback and respond appropriately drastically to it without
causing lots of pain through false positives (or rather, false
negatives...)
With a more advanced feedback system we can identify the really big
problems and be much more aggressive about handling them, without
over-reacting to smaller problems. We could, correspondingly, be a bit
less strict about how much *positive* feedback you need to push an
update, if we can be a bit more confident that we'll definitely identify
the really bad problems through negative feedback.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | identi.ca: adamwfedora
http://www.happyassassin.net