Hello,
Before Christmas, QA started discussing changes in the release blocker process for bugs that are not media-related (i.e. don't need to be fixed in the generated .iso/.img files). The conversation is available here: https://lists.fedoraproject.org/archives/list/test%40lists.fedoraproject.org...
I took an interest in one specific type of such bugs: those that are release blockers in one of the existing stable releases (i.e. Branched-1 or Branched-2). These are usually related to the upgrade process. We later decided to mark them with the AcceptedPreviousRelease flag in Bugzilla, so that is what I'll call them. This email is concerned just with these bugs, i.e. just one type of the non-media-related blockers. (I want to split the overall topic into several separate discussions, so that it's easier to talk about and we don't muddy the discussion with too many things at once.)
For these AcceptedPreviousRelease blockers, we want to make sure that all of this happens before we announce the Branched release:
a) their updates are pushed to the Branched-1 or Branched-2 updates repo
b) the contents of those repos are available to all our users, also considering all caches and refresh intervals
Ensuring a) is easy and will be tracked by QA and Bodhi as usual, but for b) we will need help from releng/infra. Since many users upgrade immediately on release day, we really need to make sure the updates are available for everybody by that time, otherwise a large portion of our user base will get affected by those blocker bugs. I have tried to investigate how best to ensure the latest repo contents are available to everyone, and I described it here: https://lists.fedoraproject.org/archives/list/test%40lists.fedoraproject.org...
As the easiest way to achieve this, I suggested we create a new MirrorManager-related tool which will strip all old metadata from the repo metalink. Dennis from RelEng agreed he could run such a tool in these circumstances to make sure only the latest metadata is distributed. I suppose the tool could work something like this:
1. A blocker update was pushed to f23-updates on 2015-01-07.
2. Releng will run the tool like this:
   $ ./mm-strip-old-metadata --release 23 --repo updates --date 2015-01-07
   (This is assuming there is only one push per day; if not, maybe we can have --timestamp instead of --date.)
3. The tool will go through all metalinks for f23-updates (all primary architectures), and drop each mm0:alternate section which has mm0:timestamp older than the provided date (let's say midnight UTC).
4. As a result, end-user machines will only connect to repositories which already serve the blocker update (have that or newer repo tree).
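The filtering described in step 3 can be sketched in a few lines of Python. This is only an illustration of the idea, not the proposed tool: the function name strip_old_alternates is hypothetical, and the mm0 namespace URI and the mm0:alternates/mm0:alternate/mm0:timestamp layout (with timestamps as Unix epoch seconds) are assumptions about what the metalink looks like.

```python
import calendar
import datetime
import xml.etree.ElementTree as ET

# Assumed namespace URI for the mm0: prefix in MirrorManager metalinks.
MM0 = "http://fedorahosted.org/mirrormanager"


def strip_old_alternates(metalink_xml, cutoff_date):
    """Return the metalink with every mm0:alternate older than cutoff_date removed.

    cutoff_date is a datetime.date and is interpreted as midnight UTC.
    """
    # date.timetuple() has hour/minute/second set to 0; timegm treats it as UTC.
    cutoff = calendar.timegm(cutoff_date.timetuple())
    root = ET.fromstring(metalink_xml)
    # Collect the containers first so the tree is not modified while walking it.
    for container in list(root.iter(f"{{{MM0}}}alternates")):
        for alternate in list(container):
            if alternate.tag != f"{{{MM0}}}alternate":
                continue
            timestamp = alternate.find(f"{{{MM0}}}timestamp")
            if timestamp is not None and int(timestamp.text) < cutoff:
                container.remove(alternate)
    return ET.tostring(root, encoding="unicode")
```

Calling strip_old_alternates(xml_text, datetime.date(2015, 1, 7)) would keep only alternates stamped at or after 2015-01-07 00:00 UTC. A --timestamp variant would simply take the cutoff as an integer instead of deriving it from a date.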
And now finally the important question: does the infrastructure team think this is easily doable, or is there something not accounted for? Is there someone here willing to create such a tool? In December, I talked to Adrian Reber and I got the impression he's a MirrorManager developer, so that's the only name I know of, but I don't want to bother anyone directly. Are there more MM developers? Is anyone willing to help out with this?
Thanks a lot, Kamil
Hi,
...
- The tool will go through all metalinks for f23-updates (all primary
architectures), and drop each mm0:alternate section which has mm0:timestamp older than the provided date (let's say midnight UTC).
Please note that there are no metalink files: these files are generated on the fly from a cache by the mirrorlist servers. I have a patch for this that I'll submit upstream to add the feature itself, and will discuss with releng how they want to fire this off.
With kind regards, Patrick Uiterwijk Fedora Infra
Thanks a lot, Patrick!
On Fri, Jan 08, 2016 at 02:40:52PM -0500, Patrick Uiterwijk wrote:
- The tool will go through all metalinks for f23-updates (all primary
architectures), and drop each mm0:alternate section which has mm0:timestamp older than the provided date (let's say midnight UTC).
Please note that there are no metalink files: these files are generated on the fly from a cache by the mirrorlist servers. I have a patch for this that I'll submit upstream to add the feature itself, and will discuss with releng how they want to fire this off.
The new script is now available on mm-backend01: mm2_emergency-expire-repo
I think I have seen that Patrick also created a playbook to let the script run in the correct environment and configuration.
We have successfully tested the script in the staging environment, and after it runs, only the newest repomd.xml file is listed in the metalink.
As far as I understand MirrorManager, this script only changes the number of alternate repomd.xml files in the metalink. The number of mirrors returned does not change. Depending on the last run of the master mirror crawler (umdl), the state of the crawler checking the mirrors, and which mirrors are running report_mirror, this may lead to a situation where mirrors are offered to clients which might not yet have the newest files.
Adrian
In other words, some of those repos included in the metalink might not correspond to the included repomd.xml hash, right?
Is that a problem? I believe DNF should just skip those repos and find the first one which matches the provided metadata hash.
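To make the expected client behaviour concrete, here is a minimal Python sketch of "skip mirrors whose repomd.xml does not match the hash from the metalink". It is not DNF's actual implementation; the function name first_matching_mirror and the fetch callable are hypothetical, and sha256 is assumed as the hash type.

```python
import hashlib


def first_matching_mirror(mirror_urls, expected_sha256, fetch):
    """Return the first mirror whose repomd.xml matches expected_sha256, or None.

    fetch is a caller-supplied function (hypothetical here) that takes a mirror
    URL and returns the bytes of that mirror's repomd.xml, raising OSError on
    network or I/O failure.
    """
    for url in mirror_urls:
        try:
            data = fetch(url)
        except OSError:
            # Unreachable mirror: try the next one.
            continue
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return url
        # Stale mirror: its metadata does not match the metalink, so skip it.
    return None
```

With only the newest repomd.xml hash left in the metalink, a client behaving like this would pass over mirrors that have not synced yet and settle on the first one that has.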
On Mon, Feb 08, 2016 at 07:43:14AM -0500, Kamil Paral wrote:
In other words, some of those repos included in the metalink might not correspond to the included repomd.xml hash, right?
Correct.
Is that a problem? I believe DNF should just skip those repos and find the first one which matches the provided metadata hash.
It shouldn't be a problem. It is a situation which can happen anytime. Just wanted to mention it once more.
Adrian
infrastructure@lists.fedoraproject.org