Dear List,
I have a working koji setup with several clients. When I submit a few jobs, it works great. However, when I submit a lot of packages (>100), some of the builds fail with the following error while mock is populating the buildroot (in root.log):
DEBUG util.py:264: Installed:
DEBUG util.py:264: bash.x86_64 0:4.1.2-15.el6_4
DEBUG util.py:264: bzip2.x86_64 0:1.0.5-7.el6_0
DEBUG util.py:264: coreutils.x86_64 0:8.4-19.el6_4.2
....
DEBUG util.py:264: xz.x86_64 0:4.999.9-0.5.beta.20091007git.slxrv3
DEBUG util.py:264: Dependency Installed:
DEBUG util.py:264: audit-libs.x86_64 0:2.2-2.el6
DEBUG util.py:264: basesystem.noarch 0:10.0-4.el6
....
DEBUG util.py:264: xz-lzma-compat.x86_64 0:4.999.9-0.5.beta.20091007git.slxrv3
DEBUG util.py:264: zlib.x86_64 0:1.2.3-29.el6
DEBUG util.py:354: Child return code was: 0
DEBUG util.py:314: Executing command: /usr/bin/repoquery -c /var/lib/mock/dist-slxrv3-build-5544-1386/root//etc/yum.conf -a --qf '%{nevra} %{buildtime} %{size} %{pkgid} %{repoid}' > /var/lib/mock/dist-slxrv3-build-5544-1386/result/available_pkgs with env {'LANG': 'en_US.UTF-8', 'TERM': 'vt100', 'SHELL': '/bin/bash', 'HOSTNAME': 'mock', 'HOME': '/builddir', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin'}
DEBUG util.py:264: Traceback (most recent call last):
DEBUG util.py:264: File "/usr/bin/repoquery", line 1510, in <module>
DEBUG util.py:264: main(sys.argv)
DEBUG util.py:264: File "/usr/bin/repoquery", line 1504, in main
DEBUG util.py:264: repoq.runQuery(regexs)
DEBUG util.py:264: File "/usr/bin/repoquery", line 982, in runQuery
DEBUG util.py:264: pkgs = self.matchPkgs(items, plain_pkgs=plain_pkgs)
DEBUG util.py:264: File "/usr/bin/repoquery", line 903, in matchPkgs
DEBUG util.py:264: pkgs = self.returnPkgList(patterns=items)
DEBUG util.py:264: File "/usr/bin/repoquery", line 856, in returnPkgList
DEBUG util.py:264: pkgs = self.pkgSack.returnNewestByNameArch(**kwargs)
DEBUG util.py:264: File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 1013, in <lambda>
DEBUG util.py:264: pkgSack = property(fget=lambda self: self._getSacks(),
DEBUG util.py:264: File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 779, in _getSacks
DEBUG util.py:264: self.repos.populateSack(which=repos)
DEBUG util.py:264: File "/usr/lib/python2.7/site-packages/yum/repos.py", line 309, in populateSack
DEBUG util.py:264: self.doSetup()
DEBUG util.py:264: File "/usr/lib/python2.7/site-packages/yum/repos.py", line 134, in doSetup
DEBUG util.py:264: self.retrieveAllMD()
DEBUG util.py:264: File "/usr/lib/python2.7/site-packages/yum/repos.py", line 92, in retrieveAllMD
DEBUG util.py:264: downloading = repo._commonRetrieveDataMD_list(mdtypes)
DEBUG util.py:264: File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 1540, in _commonRetrieveDataMD_list
DEBUG util.py:264: os.rename(local, local + '.old.tmp')
DEBUG util.py:264: OSError: [Errno 2] No such file or directory
DEBUG util.py:354: Child return code was: 1
DEBUG util.py:314: Executing command: ['/bin/umount', '-n', '-l', '/var/lib/mock/dist-slxrv3-build-5544-1386/root/dev/pts'] with env {'LANG': 'en_US.UTF-8', 'TERM': 'vt100', 'SHELL': '/bin/bash', 'HOSTNAME': 'mock', 'HOME': '/builddir', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin'}
Is it because the newRepo task has not finished but has somehow already created the softlink to the latest (and unfinished) koji repo? Has anyone seen the same error before?
Thanks for your reply, Peter Bojtos
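For a long root.log it can help to list only the commands that exited non-zero, together with the last line they printed. The following is a minimal Python sketch, assuming every record starts with a "DEBUG <file>.py:<line>:" prefix as in the excerpt above. The function names are made up for illustration and are not part of mock.

```python
import re

# Assumption: each root.log record begins with "DEBUG <file>.py:<line>:",
# as in the excerpt above. Splitting on that prefix also handles a log
# that was flattened onto a single line.
RECORD = re.compile(r"DEBUG \w+\.py:\d+:\s*")


def split_records(text):
    """Split a root.log blob into individual records without the prefix."""
    return [r.strip() for r in RECORD.split(text) if r.strip()]


def failed_commands(text):
    """Return (command, return_code, last_output_line) for non-zero exits."""
    failures = []
    command, last_output = None, None
    for record in split_records(text):
        if record.startswith("Executing command:"):
            command = record[len("Executing command:"):].strip()
            last_output = None
        elif record.startswith("Child return code was:"):
            code = int(record.rsplit(":", 1)[1])
            if code != 0 and command is not None:
                failures.append((command, code, last_output))
            command = None
        else:
            # Any other record is output of the command currently running.
            last_output = record
    return failures
```

Run against the log above, this would report the repoquery call with return code 1 and the OSError line as its last output, which narrows the failure to the repo metadata step rather than the package installation.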
On 10/27/2013 05:36 AM, Bojtos Péter wrote:
> DEBUG util.py:264: Installed:
> DEBUG util.py:264: bash.x86_64 0:4.1.2-15.el6_4
> DEBUG util.py:264: bzip2.x86_64 0:1.0.5-7.el6_0
> DEBUG util.py:264: coreutils.x86_64 0:8.4-19.el6_4.2
> ....
> DEBUG util.py:264: xz.x86_64 0:4.999.9-0.5.beta.20091007git.slxrv3
> DEBUG util.py:264: Dependency Installed:
> DEBUG util.py:264: audit-libs.x86_64 0:2.2-2.el6
> DEBUG util.py:264: basesystem.noarch 0:10.0-4.el6
> ....
> DEBUG util.py:264: xz-lzma-compat.x86_64 0:4.999.9-0.5.beta.20091007git.slxrv3
> DEBUG util.py:264: zlib.x86_64 0:1.2.3-29.el6
Is this really all that is getting installed in your chroot? If so, I suspect this is the root of the problem. You probably need to adjust your groups data for the build tag.
See what the following command gives:

# koji list-groups dist-slxrv3-build
You need to have at least some basic packages listed in the build and srpm-build groups. For reference, see what similar commands against koji.fedoraproject.org give.
> Is it because the newRepo task has not finished but has somehow already created the softlink to the latest (and unfinished) koji repo? Has anyone seen the same error before?
Highly unlikely. Each newRepo task creates an entirely new repo (the old one stays around for a while until kojira clears it). When a build starts, it asks the hub for the current active repo, so there should be no way for a build to receive a repo that is still being created.
Of course, the repo could be broken in many other ways.
> Thanks for your reply, Peter Bojtos

--
buildsys mailing list
buildsys@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/buildsys
The installed packages are the following:
[kojiadmin@koji1 ~]$ koji list-groups dist-slxrv3-build
build [dist-slxrv3]
  bash: None, default [dist-slxrv3-build]
  bzip2: None, default [dist-slxrv3-build]
  coreutils: None, default [dist-slxrv3-build]
  cpio: None, default [dist-slxrv3-build]
  diffutils: None, default [dist-slxrv3-build]
  findutils: None, default [dist-slxrv3-build]
  gawk: None, default [dist-slxrv3-build]
  gcc: None, default [dist-slxrv3-build]
  gcc-c++: None, default [dist-slxrv3-build]
  grep: None, default [dist-slxrv3-build]
  gzip: None, default [dist-slxrv3-build]
  info: None, default [dist-slxrv3-build]
  make: None, default [dist-slxrv3-build]
  patch: None, default [dist-slxrv3-build]
  python-libs: None, default [dist-slxrv3-build]
  redhat-rpm-config: None, default [dist-slxrv3-build]
  rpm: None, default [dist-slxrv3-build]
  rpm-build: None, default [dist-slxrv3-build]
  rpm-libs: None, default [dist-slxrv3-build]
  sed: None, default [dist-slxrv3-build]
  shadow-utils: None, default [dist-slxrv3-build]
  sulixerver-release: None, default [dist-slxrv3-build]
  tar: None, default [dist-slxrv3-build]
  unzip: None, default [dist-slxrv3-build]
  ustr: None, default [dist-slxrv3-build]
  util-linux: None, default [dist-slxrv3-build]
  which: None, default [dist-slxrv3-build]
  xz: None, default [dist-slxrv3-build]
srpm-build [dist-slxrv3]
The srpm-build group is empty. Could that be the problem? Why do 90% of the builds work while some of them fail at buildroot creation?
Peter
On 10/29/2013 04:59 AM, Bojtos Péter wrote:
> The srpm-build group is empty. Could that be the problem? Why do 90% of the builds work while some of them fail at buildroot creation?
Are you building from srpm or from scm? If you are building from scm, then you must have the srpm-build group specified (this group is used to create the buildroot that makes the initial srpm from the scm content).
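The rule above (the build group is always needed, srpm-build only for SCM builds) can be checked mechanically against the list-groups output. The following is a minimal Python sketch, assuming the command prints group names unindented and their packages on indented lines. The helper names parse_groups and groups_to_fix are hypothetical and not part of koji.

```python
def parse_groups(output):
    """Map each group name to the list of package names it contains.

    Assumption: group lines are unindented ("build  [tag]") and package
    lines are indented ("  bash: None, default [tag]").
    """
    groups = {}
    current = None
    for line in output.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            # Indented line: the package name is everything before the colon.
            if current is not None:
                groups[current].append(line.strip().split(":", 1)[0])
        else:
            # Unindented line: the first word is the group name.
            current = line.split()[0]
            groups[current] = []
    return groups


def groups_to_fix(output, from_scm):
    """Return the required groups that are missing or empty."""
    required = ["build"] + (["srpm-build"] if from_scm else [])
    groups = parse_groups(output)
    return [name for name in required if not groups.get(name)]
```

For the output posted earlier in the thread, groups_to_fix(output, from_scm=False) would return an empty list, while from_scm=True would flag srpm-build.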
All of the builds are from src.rpm; we're not using scm (yet). What could be the root of the problem?
----- Original Message -----
From: "Mike McLean" mikem@redhat.com
To: buildsys@lists.fedoraproject.org
Sent: Tuesday, October 29, 2013 16:23:17
Subject: Re: strange build errors

> Are you building from srpm or from scm? If you are building from scm, then you must have the srpm-build group specified (this group is used to create the buildroot that makes the initial srpm from the scm content).
----- Original Message -----
From: "Mike McLean" mikem@redhat.com
> On 10/27/2013 05:36 AM, Bojtos Péter wrote:
>> I have a working koji setup with several clients. When I submit a few jobs, it works great. However, when I submit a lot of packages (>100), some of the builds fail with the following error while mock is populating the buildroot (in root.log):
>> Is it because the newRepo task has not finished but has somehow already created the softlink to the latest (and unfinished) koji repo? Has anyone seen the same error before?
> Highly unlikely. Each newRepo task creates an entirely new repo (the old one stays around for a while until kojira clears it). When a build starts, it asks the hub for the current active repo, so there should be no way for a build to receive a repo that is still being created.
It could be the other way around though -- the build could be trying to use an out of date repo that is being deleted...
Something like:
a) submit 100 jobs
b) build 1 starts, ..., completes
c) build 2 starts, compile begins
d) build 3 starts, build task grabs the latest build repo id, passes that to buildArch
e) kojira triggers newRepo task
f) buildArch remains on hold as all builders are busy
g) newRepo finishes
h) buildArch 3 begins actually doing things, working with old repo
i) kojira starts deleting old repos
j) buildArch 3 fails because its repo is no longer there
If you're not doing chain-builds, you could just turn off kojira while the 100 package build is happening. If you are doing chain-builds, this shouldn't be an issue because each build task won't get started until the newRepo is completed anyway, so the buildArch tasks will always be looking at the most recent repo.
If that's really the problem, I guess the long-term fix would be for kojira to look for any active tasks referencing a repo before scheduling it for deletion?
Cheers, aj
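The timeline in steps a) to j) can be written down as a small timing check. This is only a toy model in Python, not kojira's actual logic: it assumes an old repo is deleted exactly "lifetime" seconds after the next newRepo finishes, and the function and parameter names are invented for illustration.

```python
def build_arch_fails(grab_time, new_repo_done, start_time, lifetime):
    """Return True if buildArch starts after its repo has been deleted.

    grab_time     -- when the build task asked the hub for the current repo
    new_repo_done -- when the next newRepo task finished (old repo expires)
    start_time    -- when a free builder finally starts the buildArch task
    lifetime      -- seconds an expired repo is kept before deletion
                     (assumed to be applied exactly, which is a simplification)
    """
    if new_repo_done <= grab_time:
        # newRepo finished first, so the build was handed the new repo.
        return False
    deleted_at = new_repo_done + lifetime
    return start_time >= deleted_at


# Step d) at t=0, step g) at t=600, step h) at t=3600 (all in seconds).
print(build_arch_fails(0, 600, 3600, lifetime=0))      # old repo already gone
print(build_arch_fails(0, 600, 3600, lifetime=86400))  # old repo still on disk
```

Under this model the failure needs two things at once: a newRepo finishing between steps d) and h), and a queue delay longer than the retention period.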
This is an interesting case, but it seems very unlikely here. I've set up kojira to keep repositories for 1 day with the following in the kojira config:

deleted_repo_lifetime = 86400

That means the race could only happen if the buildArch task were picked up a full day after the build task was picked up by a build host. Anyway, I'm going to set it to 1 week and see what happens.
Cheers, Peter
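Since deleted_repo_lifetime is given in seconds, the one-day and one-week values are easy to derive instead of typing them by hand. A minimal Python sketch follows; lifetime_setting is a hypothetical helper, and only the option name comes from the message above.

```python
from datetime import timedelta


def lifetime_setting(**duration):
    """Format a kojira config line for the given duration (e.g. days=1)."""
    seconds = int(timedelta(**duration).total_seconds())
    return f"deleted_repo_lifetime = {seconds}"


print(lifetime_setting(days=1))   # deleted_repo_lifetime = 86400
print(lifetime_setting(weeks=1))  # deleted_repo_lifetime = 604800
```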
> --
> Anthony Towns atowns@redhat.com
> Red Hat Release Engineering