Florian,
You expressed some worry about checking in the tarballs for the frozen ABI specification into dist-git. Really git is not designed to hold tarballs, and the source cache is just wrong since this is not a source tarball (we used to abuse it also for the releng tarball).
It is relatively easy to move all the files directly into dist-git, and use a lua construct like this:
%{lua:
-- List all of the frozen ABI xml files as source files.
function recursedir(directory)
    local i, t, popen = 0, {}, io.popen
    local pfile = popen('find "'..directory..'" -type f')
    for filename in pfile:lines() do
        i = i + 1
        t[i] = filename
    end
    pfile:close()
    return t
end
-- There are almost 2000 ABI specification files.
lines = recursedir('releng/frozen-abi/')
-- The last existing Source file is numbered 12.
j = 12
for i, v in ipairs(lines) do
    j = j + 1
    print('Source'..j..': '..v)
end
}
%endif
Which generates the source line entries from the directories. This would allow us to manually tweak the ABI files by hand and track the changes over time as we update glibc.
Would this be a better design?
On 03/02/2018 03:00 PM, Carlos O'Donell wrote:
Florian,
You expressed some worry about checking in the tarballs for the frozen ABI specification into dist-git. Really git is not designed to hold tarballs, and the source cache is just wrong since this is not a source tarball (we used to abuse it also for the releng tarball).
It is relatively easy to move all the files directly into dist-git, and use a lua construct like this:
%{lua:
-- List all of the frozen ABI xml files as source files.
function recursedir(directory)
    local i, t, popen = 0, {}, io.popen
    local pfile = popen('find "'..directory..'" -type f')
    for filename in pfile:lines() do
        i = i + 1
        t[i] = filename
    end
    pfile:close()
    return t
end
-- There are almost 2000 ABI specification files.
lines = recursedir('releng/frozen-abi/')
-- The last existing Source file is numbered 12.
j = 12
for i, v in ipairs(lines) do
    j = j + 1
    print('Source'..j..': '..v)
end
}
%endif
Which generates the source line entries from the directories. This would allow us to manually tweak the ABI files by hand and track the changes over time as we update glibc.
Would this be a better design?
... and this doesn't work at *all*.
The "SourceN: path/file" definitions have path/ stripped leaving only file.
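To see why the stripping is fatal for this layout, here is a minimal illustration in Python (the file names are hypothetical, and this only models rpm's behavior rather than invoking it): once the directory part of each "SourceN: path/file" entry is discarded, per-arch files with the same basename collide.

```python
import os

# Hypothetical frozen-ABI files, as the lua recursedir() would list them.
sources = [
    "releng/frozen-abi/x86_64/libc.so.6.xml",
    "releng/frozen-abi/i686/libc.so.6.xml",
]

# RPM keeps only the file name of a "SourceN: path/file" entry,
# so the directory structure is lost.
flattened = [os.path.basename(p) for p in sources]

print(flattened)             # both arches reduce to 'libc.so.6.xml'
print(len(set(flattened)))   # → 1, i.e. the two files collide
```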
On 03/03/2018 12:19 AM, Carlos O'Donell wrote:
Which generates the source line entries from the directories. This would allow us to manually tweak the ABI files by hand and track the changes over time as we update glibc.
Would this be a better design?
... and this doesn't work at *all*.
The "SourceN: path/file" definitions have path/ stripped leaving only file.
Right. I looked at the RPM sources a while back and could not find a way to make this work at all.
Thanks, Florian
On 03/03/2018 12:00 AM, Carlos O'Donell wrote:
You expressed some worry about checking in the tarballs for the frozen ABI specification into dist-git. Really git is not designed to hold tarballs, and the source cache is just wrong since this is not a source tarball (we used to abuse it also for the releng tarball).
We really need the individual input files under version control. Otherwise, changes will be impossible to review.
So there has to be a repository somewhere with the data.
There is already a project for ABI checks in Fedora. Could we integrate with that?
Thanks, Florian
On 03/05/2018 01:25 AM, Florian Weimer wrote:
On 03/03/2018 12:00 AM, Carlos O'Donell wrote:
You expressed some worry about checking in the tarballs for the frozen ABI specification into dist-git. Really git is not designed to hold tarballs, and the source cache is just wrong since this is not a source tarball (we used to abuse it also for the releng tarball).
We really need the individual input files under version control. Otherwise, changes will be impossible to review.
So there has to be a repository somewhere with the data.
Agreed.
There is already a project for ABI checks in Fedora. Could we integrate with that?
This is taskotron. Taskotron already has dist.abicheck, and it is run against glibc using abipkgdiff.
Empirically, I think taskotron is too late for developer tooling, and I'd like to be able to give us immediate per-patch feedback as we develop our work, particularly if we are going to do more automated patch backporting using our tooling.
I am going to suggest the following, and tell me if you think it is a good solution:
1. Create a pagure.io upstream project called 'glibc-abi' which has serialized ABI details for each target arch we care about, matching one released from upstream: https://sourceware.org/glibc/wiki/ABIList
2. Create branches in 'glibc-abi' which match upstream glibc releases, and also create public Fedora or RHEL branches as required. The ABI is public anyway, and the project serves as a touch-point for anyone wanting to consume metadata about our published ABI (internal and external).
3. Consume tarballs of the 'glibc-abi' branches in downstream Fedora glibc, RHEL, CentOS etc., so we would have two source tarballs (glibc, glibc-abi).
We avoid creating a new package in downstream distros to package the data, we just consume it directly from the glibc-abi branches.
We would discuss, merge, and update glibc-abi project to track the various ABIs of the downstream branched releases.
From a bug tracking perspective, we would need to file bugs to rebase the glibc-abi data from upstream if we needed to pull in new data for, say, Fedora 28.
Does that make sense?
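For step 3, the spec changes could be small. A hedged sketch only: the source numbers, tarball names, and the second %setup invocation below are assumptions for illustration, not the actual glibc.spec.

```
# Hypothetical glibc.spec fragment consuming both tarballs:
Source0: glibc-%{version}.tar.xz
Source1: glibc-abi-%{version}.tar.gz

%prep
%setup -q
# Unpack the ABI baseline (Source1) inside the glibc source tree,
# without deleting what the first %setup already unpacked:
%setup -q -T -D -a 1
```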
On 03/05/2018 07:48 PM, Carlos O'Donell wrote:
Empirically, I think taskotron is too late for developer tooling, and I'd like to be able to give us immediate per-patch feedback as we develop our work, particularly if we are going to do more automated patch backporting using our tooling.
But we rarely have individual changes or patches in Fedora dist-git. Downstream, there is even a tendency to munge together unrelated patches in a single patch.
So it's still not clear to me what you are trying to achieve here. Based on your comments above, it looks like downstream is already too late?
Thanks, Florian
On 03/06/2018 03:16 AM, Florian Weimer wrote:
On 03/05/2018 07:48 PM, Carlos O'Donell wrote:
Empirically, I think taskotron is too late for developer tooling, and I'd like to be able to give us immediate per-patch feedback as we develop our work, particularly if we are going to do more automated patch backporting using our tooling.
But we rarely have individual changes or patches in Fedora dist-git. Downstream, there is even a tendency to munge together unrelated patches in a single patch.
Hopefully my notes below help clarify my position.
I do not want to filibuster you, please feel free to respond to whichever points you think need clarification, and I can summarize again later.
(a) Few low-quality patches in Fedora.
We should not group together dissimilar patches, this is poor quality engineering. Patches should be split out logically based on what they implement.
This is a flaw in our existing Fedora packages, not a reflection of the quality we should strive to attain in our work.
(b) Reasons for automated ABI testing in downstream.
1. We often trial things in Fedora Rawhide with relatively large patches making sweeping changes. Examples include all of the P&C changes that I worked on with Torvald and deployed to test in Rawhide. It would have been nice to have an automated check in Fedora Rawhide to compare ABIs, in addition to the usual manual inspection, looking for anything out of the ordinary. For example, we changed the internals of the various pthread structures, and an immediate warning and review would have been good if we had made a publicly visible ABI change.
I would like us to do more testing in Fedora Rawhide, and understanding the exact nature of the ABI change would be useful. It should be as easy as turning on ABI verification and looking at the logs, or compare them to either a previous run or to the glibc-abi project's baseline.
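At its core, the comparison against a baseline described above is a set difference. A toy sketch follows; the symbol strings and the set-of-strings baseline format are illustrative, not real abidw output.

```python
# Compare the current ABI against a saved baseline and flag any
# publicly visible additions or removals. Symbol names are made up.
baseline = {"pthread_create@@GLIBC_2.2.5", "pthread_join@@GLIBC_2.2.5"}
current = {"pthread_create@@GLIBC_2.2.5", "pthread_join@@GLIBC_2.2.5",
           "pthread_frobnicate@@GLIBC_2.28"}  # hypothetical new symbol

added = sorted(current - baseline)
removed = sorted(baseline - current)
if added or removed:
    print("publicly visible ABI change:", added, removed)
```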
2. Enable Fedora Rawhide ABI freeze immediately after glibc upstream freezes, but keep syncing Fedora Rawhide weekly or daily to catch any last-minute ABI changes (internal or external) to review.
3. Enable more aggressive backporting between upstream master, upstream release, Fedora and RHEL branches.
3.1 Fedora<->Upstream release
We should aim for zero patches in Fedora dist-git in Rawhide (except for point 1 above, and immediate bug fixes or issues with our own toolchains), but for the three other branches we maintain for released Fedora we should be doing more active backporting to help Fedora. This active backporting has the benefit that it provides better coverage of fixes from master (very important!), but carries with it ABI risks.
My hope is that with your sync script, and automated ABI checking, that the released branches can be updated by anyone on the team with a higher confidence that the build is OK (requires we fail the builds for testsuite failures *and* ABI failures e.g. belt and suspenders, but we can fix the regression testsuite checking as a next step).
Having the assurance at the %install phase is crucial IMO, since it gives you immediate feedback (important for a good developer workflow).
e.g.
* Cherry-pick a patch from master to local release branch X.Y.
* Do a sync to Fedora Z against the local release branch X.Y.
* Push a scratch build and see if you get any failures across all arches in both tests and ABI.
* No ABI failures? No test failures? OK, propose the cherry-pick upstream.
* Commit to the upstream release branch, and do the official sync.
Granted, adding ABI checking adds complexity to the developer workflow, but if we never make mistakes it should never really trigger, and only some of the team members need to update and sign off on the new ABI at each release? :-)
3.2 Fedora<->Upstream master
At present we do simple syncs from upstream release branches to Fedora release branches, but we could experiment with more aggressive models. Like gcc, we could fix more things on the stable branches, and automated ABI checking would help.
For example, consider all the work we do in RHEL to backport changes from upstream master to glibc 2.17. In the RHEL 7 timeframe I backported all of the IN_MODULE() changes and had to verify at each stage that we didn't change the ABI. This was done by hand because we didn't have automation. With the new ABI verification, I would only have needed to do a build after applying each patch to verify that it built.
3.3 Upstream release<->RHEL/CentOS
In RHEL and CentOS the ABI is frozen at GA, and having automated testing for ABI with this level of detail, integrated into %install, will help us ensure we make no mistakes. Even with libabigail integration into any late-phase content validation, we *need* this earlier during development of either product. We can't rely on upstream testing in this case because our ABI may have deviated slightly e.g. backported mix of GLIBC_PRIVATE changes.
So it's still not clear to me what you are trying to achieve here. Based on your comments above, it looks like downstream is already too late?
We need ABI testing at each stage to provide confidence that allows developers to work more quickly, trusting the regression tests to help them make fewer mistakes.
We absolutely need deeper ABI testing upstream, but I can't recommend that yet without first gaining my own experience in Fedora and RHEL doing exactly what I want to recommend upstream.
On 03/06/2018 10:26 PM, Carlos O'Donell wrote:
We need ABI testing at each stage to provide confidence that allows developers to work more quickly, trusting the regression tests to help them make fewer mistakes.
This may be the case, but I'm even less convinced now that ABI test artifacts belong in the source package itself. We need external ABI test tooling, with proper test case management and review tools for failures. If we can't consume something that already exists, I don't think we have the resources to implement that, whether we put it into the package build itself or not.
Thanks, Florian
On 05/14/2018 10:51 AM, Florian Weimer wrote:
On 03/06/2018 10:26 PM, Carlos O'Donell wrote:
We need ABI testing at each stage to provide confidence that allows developers to work more quickly, trusting the regression tests to help them make fewer mistakes.
This may be the case, but I'm even less convinced now that ABI test artifacts belong in the source package itself. We need external ABI test tooling, with proper test case management and review tools for failures. If we can't consume something that already exists, I don't think we have the resources to implement that, whether we put it into the package build itself or not.
How do we make incremental progress to get us there?
- External ABI test tooling.
  = Today this is abidw.
- Test case management.
  = There is just one test today: compare all DSOs.
  = We could manually make one shell script per DSO and run that as a single test case.
  = pagure.io for glibc-abi is the place where ABI defects could be reported.
- Review tools for failures.
  = In the case of a failure you get the full abidw output for the failed DSO in the glibc build logs. At that point, set save_abi to 1 and warn_abi to 1, do a scratch build, fetch the results, and compare offline with abidw (we may need to save a little more data).
Notes:
- We could roll abidw in as a *patch* against the upstream sources. I could do that, but I'm not yet ready to post upstream until we have more experience with the tooling. It would effectively be an abidw-type test per directory that can install a DSO, and the test would expect an unpacked ABI snapshot on top of the glibc src directory, or in a --with-abi-baseline=/path kind of configure option. This is just a refactoring of where tests go and of how ready our patches are for inclusion upstream.
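The one-shell-script-per-DSO idea above could be generated mechanically. A sketch follows; the DSO list, paths, and the abidiff command line are illustrative placeholders (abidiff is libabigail's comparison tool, but a real harness would need the actual install paths).

```python
# Emit one ABI comparison command per installed DSO. The DSO names
# and the frozen-abi/ and build/ paths are hypothetical.
dsos = ["libc.so.6", "libpthread.so.0", "libm.so.6"]
tests = [f"abidiff frozen-abi/{d}.abi build/{d}" for d in dsos]
for t in tests:
    print(t)
# First line printed: abidiff frozen-abi/libc.so.6.abi build/libc.so.6
```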
The current proposal looks like this:
* glibc-abi project:
  - Contains curated ABI artifacts for all arches for all upstream branches of interest.
  - Distinct pull-request process.
  - Distinct bug tracking.
  [ This project is live on pagure.io ]
* glibc project:
  - Consumes a copy of the branch of glibc-abi.
  - Can produce as output a list of ABI artifacts.
  - We should refactor the glibc.spec parts into scripts that live in the glibc-abi project instead of the spec file.
  - glibc.spec uses glibc-abi sources to verify the ABI.
Steps:
fedpkg clone glibc
git clone glibc-abi
cd glibc
[do your work]
[set glibc.spec to save_abi and warn_abi]
fedpkg build --scratch --srpm glibc.src.rpm
[look for ABI warnings; if there are none, no ABI work is required, skip to (2)]
[if there are ABI changes, fetch the binary glibc.rpms (where the ABI artifacting is saved)]
cd ../glibc-abi
tar zxvf ../glibc/[downloaded ABI artifact files]*
[review ABI changes, commit, push]
make dist
cp glibc-abi*.tar.gz ../glibc/
[add the new ABI artifacts file as a new source]
(2) [download logs, verify, set save_abi=0 warn_abi=0 verify_abi=1]
[commit changes]
fedpkg build
Is there any reason why ABI testing can't be in its own package? I.e., a separate group/git/etc. that contains a testsuite verifying that the ABI is consistent. A bit of separation from the glibc project itself helps keep the tests rigorous, and reduces dependencies with respect to glibc itself.
You also want to run the tests more often than "once per build" sometimes, and a separate package can do that as well as test the installed ABI.
On 05/14/2018 07:24 PM, DJ Delorie wrote:
Is there any reason why ABI testing can't be in its own package? I.e., a separate group/git/etc. that contains a testsuite verifying that the ABI is consistent. A bit of separation from the glibc project itself helps keep the tests rigorous, and reduces dependencies with respect to glibc itself.
You also want to run the tests more often than "once per build" sometimes, and a separate package can do that as well as test the installed ABI.
That's actually a very good idea. We may want to run the Fedora 27 ABI definition against the Fedora 28 library, too.
Thanks, Florian
On 05/14/2018 01:56 PM, Florian Weimer wrote:
On 05/14/2018 07:24 PM, DJ Delorie wrote:
Is there any reason why ABI testing can't be in its own package? I.e., a separate group/git/etc. that contains a testsuite verifying that the ABI is consistent. A bit of separation from the glibc project itself helps keep the tests rigorous, and reduces dependencies with respect to glibc itself.
You also want to run the tests more often than "once per build" sometimes, and a separate package can do that as well as test the installed ABI.
That's actually a very good idea. We may want to run the Fedora 27 ABI definition against the Fedora 28 library, too.
I agree.
So I already have glibc-abi: https://pagure.io/glibc-abi/
As a concrete step I could move all scripting logic out of glibc.spec into a custom tool in glibc-abi *right now*, and it would minimize the changes in glibc.spec to just running the glibc-abi tooling.
Then you could run the tooling whenever you wanted?
On 03/05/2018 01:25 AM, Florian Weimer wrote:
On 03/03/2018 12:00 AM, Carlos O'Donell wrote:
You expressed some worry about checking in the tarballs for the frozen ABI specification into dist-git. Really git is not designed to hold tarballs, and the source cache is just wrong since this is not a source tarball (we used to abuse it also for the releng tarball).
We really need the individual input files under version control. Otherwise, changes will be impossible to review.
So there has to be a repository somewhere with the data.
DJ suggested using lua to generate the tarball as needed and never check it in. This is an interesting solution to the problem of not being allowed to have a directory structure in the SourceN: entries: it would retain dist-git history for the files but let you organize them as required with directories, e.g. conf/ for configuration files, install/ for install-time program sources, etc.
I like your git-bundle idea better since it yields a working git tree that gives you a lot more flexible patch automation using existing tooling (and history) and is self-hosting, and also has all the above benefits.
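For what it's worth, the git-bundle flow can be exercised end-to-end with stock git. A sketch with made-up repo contents follows; the idea is that the bundle file would be the SourceN artifact, and cloning it at build time recovers both the files and their history.

```shell
set -e
# Build a tiny stand-in for the glibc-abi repository.
tmp=$(mktemp -d)
cd "$tmp"
mkdir abi && cd abi
git init -q
git config user.email carlos@example.com
git config user.name 'ABI Test'
echo '<abi/>' > libc.so.6.xml
git add libc.so.6.xml
git commit -qm 'frozen ABI snapshot'

# Create a self-contained bundle; this single file is what
# dist-git would carry as a SourceN entry.
branch=$(git symbolic-ref --short HEAD)
git bundle create ../glibc-abi.bundle HEAD "$branch"

# At build time, clone the bundle to get a working tree with history.
cd ..
git clone -q glibc-abi.bundle work
ls work/libc.so.6.xml
```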
On 03/09/2018 04:21 AM, Carlos O'Donell wrote:
DJ suggested using lua to generate the tarball as needed and never check it in. This is an interesting solution to the problem of not being allowed to have a directory structure in the SourceN: entries: it would retain dist-git history for the files but let you organize them as required with directories, e.g. conf/ for configuration files, install/ for install-time program sources, etc.
This assumes that the directory for the SourceN: files is writable. I don't think this is necessarily true. I doubt you can generate source file contents dynamically this way. In any case, you'll need separate hacks for local builds/prep, mock and COPR. I tried that and I did get quite far, but it's really hackish (including finding relevant directories by looking at /proc/self).
Thanks, Florian
On 03/09/2018 08:38 AM, Florian Weimer wrote:
On 03/09/2018 04:21 AM, Carlos O'Donell wrote:
DJ suggested using lua to generate the tarball as needed and never check it in. This is an interesting solution to the problem of not being allowed to have a directory structure in the SourceN: entries: it would retain dist-git history for the files but let you organize them as required with directories, e.g. conf/ for configuration files, install/ for install-time program sources, etc.
This assumes that the directory for the SourceN: files is writable. I don't think this is necessarily true. I doubt you can generate source file contents dynamically this way. In any case, you'll need separate hacks for local builds/prep, mock and COPR. I tried that and I did get quite far, but it's really hackish (including finding relevant directories by looking at /proc/self).
I agree it's a hack. I'm not going to pursue this. We need an upstream repo that is *not* dist-git, and a method to pull from it, and keep development history, like your git-bundle idea.
Cheers, Carlos.