So, I and others have blabbed about many pieces of this, both recently
and in the past, but I think a lot of people aren't seeing the big
picture. I think it's time I fit it all together.
Problem: Fedora is buggy, and updates are rife with regressions.
Solution: We need more and wider testing.
Problem: We need more and wider testing. Why don't we get more testing?
The reason *I* don't go out of my way to test updates is that if there's
a regression, it's such a pain in the ass to get the system back to a
known good state, keep it that way, and report a bug. If it's a pain in
the ass for me, it's impossible for Aunt Tillie.
Solution:
1) Make it easy to report bugs. Bugzilla is complex, slow, and
inscrutable. We need to put a simpler layer on top of it. Reporting a
bug should require just a few clicks. The report should automatically
include all the information needed: the distro version, package
version, arch, and whatever else the maintainers require, such as the
xorg.conf and Xorg.0.log that the Xorg team demands. Make finding dupes
easier. Collect stack traces system-wide and enter them in a database,
which Bugzilla can reference and from which Bugzilla bugs can be
derived: a system-wide kerneloops. (I know this has been talked about;
what's the status?)
2) Make it simple to roll back to a known good state. We need a "system
restore". I know what you're thinking, but our vastly superior,
centralized, system-wide package management (and lack of a whole
separate "system registry" namespace) allows us to make this actually
work. We need per-package rollback. Period.
3) Make it so yum will refuse to upgrade the package rolled back in step
2 until the bug reported in step 1 is fixed.
4) For when things go really wrong, we need a rescue image in /boot. All
the above functionality must be available inside the rescue environment.
5) So how do bugs get fixed? Make it easy to cherry-pick updates from
updates-testing or even directly from Bodhi. How useful is it to blindly
follow every update in updates-testing? For most users, it's useless.
Such adventurous people should probably just run rawhide... What we
really need is to make it easy to pick a specific release of an update
to try, for example when a bug report directs the user to one. A user
should just be able to click on a link given in the bug report and have
the update installed. Available updates and the reasons for them need to
be more discoverable. Don't forget step #2.
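
To make the link between steps 2 and 3 concrete, here is a minimal
Python sketch of the bookkeeping only. Everything in it is hypothetical:
HoldRegistry, roll_back, and filter_updates are names made up for
illustration, not part of yum or Bugzilla, and the set of fixed bugs is
passed in by hand rather than queried from a real tracker.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Hold:
    """Records why a package is pinned (all fields are illustrative)."""
    package: str
    bad_version: str   # the version that introduced the regression
    good_version: str  # the known good version we rolled back to
    bug_id: int        # ID of the bug filed in step 1 (hypothetical)


@dataclass
class HoldRegistry:
    """Tracks rolled-back packages and the bugs that block their upgrade."""
    holds: Dict[str, Hold] = field(default_factory=dict)

    def roll_back(self, package: str, bad_version: str,
                  good_version: str, bug_id: int) -> Hold:
        # Step 2: remember the rollback and tie it to the reported bug.
        hold = Hold(package, bad_version, good_version, bug_id)
        self.holds[package] = hold
        return hold

    def filter_updates(self, candidates: Dict[str, str],
                       fixed_bugs: Set[int]) -> Dict[str, str]:
        # Step 3: drop any candidate whose package is held on an open bug.
        allowed = {}
        for package, version in candidates.items():
            hold = self.holds.get(package)
            if hold is not None and hold.bug_id not in fixed_bugs:
                continue  # bug still open: refuse to upgrade
            allowed[package] = version
        return allowed


registry = HoldRegistry()
registry.roll_back("xorg-x11-server", "1.5.3", "1.5.2", bug_id=1001)
candidates = {"xorg-x11-server": "1.5.4", "bash": "3.2"}

still_broken = registry.filter_updates(candidates, fixed_bugs=set())
after_fix = registry.filter_updates(candidates, fixed_bugs={1001})
```

With the bug open, only bash would be offered; once bug 1001 is marked
fixed, both packages pass through. A real implementation would also have
to check that the offered version actually contains the fix, which this
sketch does not attempt.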
See how these things build upon and support each other?
Notice here I'm talking purely about user interface, about the end user
experience. The technical infrastructure follows from this, and I'll
save that discussion for another message. Infrastructure supports
functionality, not the other way around. I don't want to hear any "Oh
but we can't do this because blah blah technical objection blah makes
this hard". I hereby dub this the "Hard problem fallacy". Engineers
solve hard problems; that's what we do. I want to hear "To do this we
need to do x, y, z". The only objections I will accept are of the form
"You are an idiot and your ideas are stupid. We're not doing this."
I should probably put this on the Wiki so it doesn't get lost...