On Thu, Sep 27, 2007 at 01:30:15PM -0500, Clark Williams wrote:
Michael/Jesse,
We're seeing odd failures with mock and the orphanskill feature. What seems to happen is that the '; mock-helper orphanskill <rootdir>' string is tacked onto a command which is passed to do_chroot() and after the main command is run, an attempt is made to run mock-helper (which is not installed in the chroot). So people are seeing a "File not found" message after a successful command.
Now it looks to me like part of the reason for an orphanskill was that the do() routine might hang until all the child processes are done, so I'm loath to just run the orphanskill after the do_chroot() is finished (I suspect twisty lines of logic, all alike). Seems like we can do a couple of things:
- Copy mock-helper into each chroot, so it's available for orphanskill
- Back out the orphanskill logic and try again
Option #1 is somewhat easy, if kinda ugly (not sure I like the idea of scattering a setuid-root program into all our build roots). Option #2 requires that we look at the code in all the do_* and do() routines to make sure that orphanskill runs when we need it to. Ideally I'd like to ensure that orphanskill runs *outside* the chroot and that it's not needed to keep self.do() from hanging.
What do you guys think?
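One plausible way a do()-style routine ends up waiting on orphans, assumed here rather than taken from mock's source, is that it reads the command's output until end-of-file. A background child that inherits the pipe keeps it open after the main command has exited. The helper name below is hypothetical, and the sketch needs a POSIX `sh` and `sleep`:

```python
import subprocess
import time


def time_read_until_eof(shell_cmd):
    """Run shell_cmd via sh and return the seconds spent until its stdout hits EOF."""
    start = time.monotonic()
    proc = subprocess.Popen(["sh", "-c", shell_cmd], stdout=subprocess.PIPE)
    # communicate() reads until EOF, which only arrives once *every* process
    # holding the write end of the pipe has closed it, not just the shell.
    proc.communicate()
    return time.monotonic() - start


if __name__ == "__main__":
    # The backgrounded sleep inherits stdout, so the read waits for it.
    print(time_read_until_eof("sleep 1 & echo built"))
    # With its output redirected, the orphan no longer holds the pipe open.
    print(time_read_until_eof("sleep 1 >/dev/null 2>&1 & echo built"))
```

If the hang really works this way, killing the orphans (or detaching their output) is what lets the read return.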
How about we just run two commands in a row? I see the comment but don't really see why. Line 973, we don't need to run orphanskill if it isn't a chroot. For the my.do_chroot() on line 975, it looks like we could just do a my.do_chroot() followed by a normal os.system().
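The "two commands in a row" idea can be sketched as follows. This is a minimal illustration, not mock's actual code: `run_build_then_orphanskill`, its parameters, and the injectable `run` callable are hypothetical; only the `mock-helper orphanskill <rootdir>` invocation comes from this thread.

```python
import subprocess


def run_build_then_orphanskill(chroot_cmd, rootdir, run=subprocess.call):
    """Run the chroot command, then orphanskill as a separate host-side command.

    chroot_cmd is an argv list that enters the chroot itself (hypothetical
    layout); run is any callable that takes an argv list and returns an exit
    status.
    """
    try:
        status = run(chroot_cmd)
    finally:
        # Runs on the host, so mock-helper does not need to exist inside the
        # chroot, and it runs even if the build command raised.
        run(["mock-helper", "orphanskill", rootdir])
    # The caller sees the build's status, not the cleanup's.
    return status
```

Keeping the cleanup out of the command string means a missing mock-helper inside the chroot can no longer turn a successful build into a "File not found" message.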
The problem it is trying to fix is the case where the rpmbuild process spawns child processes that fork and never exit. I believe this was seen in some code that was running in the rpmbuild as a unit test?
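For illustration, one way to find such leftover processes from outside the chroot on Linux is to compare each process's `/proc/<pid>/root` link against the build root. Whether mock-helper's orphanskill works this way is not stated in this thread; `pids_rooted_in` and its `proc_dir` parameter are hypothetical, and the sketch only lists PIDs without signalling them.

```python
import os


def pids_rooted_in(rootdir, proc_dir="/proc"):
    """Return the sorted PIDs whose root directory is rootdir (Linux /proc layout)."""
    rootdir = os.path.realpath(rootdir)
    pids = []
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            root = os.readlink(os.path.join(proc_dir, entry, "root"))
        except OSError:
            continue  # process already exited, or we lack permission to look
        if os.path.realpath(root) == rootdir:
            pids.append(int(entry))
    return sorted(pids)
```

A caller with sufficient privileges could then pass each PID to `os.kill()`; that step is left out here because the right signal and retry policy depend on the build system.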
We should also be cc-ing fedora-buildsys-list. (done)
I also understand Jesse's sentiment to just back it out. If it is going to take more than a day or two to fix, we could just back it out.
-- Michael
Resent from my Red Hat account so the message isn't subject to moderation...
Michael E Brown wrote:
How about we just run two commands in a row? I see the comment but don't really see why. Line 973, we don't need to run orphanskill if it isn't a chroot. For the my.do_chroot() on line 975, it looks like we could just do a my.do_chroot() followed by a normal os.system().
That was my first thought, but I was concerned that we might be missing something subtle in the timeout code (hence my email to you :)).
If you think we can just run the orphanskill stuff after running the do_chroot() then I say that's the way to go.
The problem it is trying to fix is the case where the rpmbuild process spawns child processes that fork and never exit. I believe this was seen in some code that was running in the rpmbuild as a unit test?
We should also be cc-ing fedora-buildsys-list. (done)
wups (looks shamefaced)
I also understand Jesse's sentiment to just back it out. If it is going to take more than a day or two to fix, we could just back it out.
Let's try running it right after the rpmbuild. If that doesn't work right, then we can just comment it out while we take a closer look at it.
Clark