Does that work successfully in koji? Here, the mock Python processes survive the kill signal. Only the parent mock process terminates. The other ones keep building until the job succeeds or fails.
Michael Schwendt wrote:
> Does that work successfully in koji? Here, the mock Python processes survive the kill signal. Only the parent mock process terminates. The other ones keep building until the job succeeds or fails.
Koji is pretty aggressive about killing off task processes.
https://hosted.fedoraproject.org/koji/browser/builder/kojid#L863
On Thu, 31 Jan 2008 15:30:19 -0500, Mike McLean wrote:
>> Does that work successfully in koji? Here, the mock Python processes survive the kill signal. Only the parent mock process terminates. The other ones keep building until the job succeeds or fails.
> Koji is pretty aggressive about killing off task processes.
> https://hosted.fedoraproject.org/koji/browser/builder/kojid#L863
Hmmm... but is it successful?
It finds more child process groups and kills them, but here it doesn't kill all of mock's processes either.
To take a closer look I've ported the code into Plague (on F8). Below is one snapshot of the /proc stats of the build processes. The code finds process groups 16216 and 16220. After sending SIGTERM (or SIGKILL, doesn't matter) to pgrp -16216, pid 16216 is reported as killed, but pid 16219 is still running. Another waitpid on the pgrp gives "[Errno 10] No child processes".
[16216, '(mock)', 'S', 16068, 16216, 2532, 34816, 16068, 4194560, 289, 0, 0, 0, 0, 0, 0, 0, 20, 0, 1, 0, 603929, 1642496, 48, 4294967295L, 134512640, 134516316, 3216708864L, 3216708516L, 1115138, 0, 0, 553652224, 0, 3225615909L, 0, 0, 17, 0, 0, 0, 0]
[16219, '(python)', 'S', 16216, 16216, 2532, 34816, 16068, 4202752, 2067, 0, 0, 0, 41, 6, 0, 0, 20, 0, 1, 0, 603934, 13672448, 1764, 4294967295L, 134512640, 134514536, 3214952352L, 3214940384L, 1115138, 0, 0, 553652224, 8194, 0, 0, 0, 17, 0, 0, 0, 1]
[16220, '(python)', 'S', 16219, 16220, 2532, 34816, 16068, 4202560, 343, 0, 0, 0, 0, 0, 0, 0, 20, 0, 1, 0, 603993, 13676544, 1109, 4294967295L, 134512640, 134514536, 3214952352L, 3214940384L, 1115138, 0, 0, 553652224, 2, 0, 0, 0, 17, 0, 0, 0, 0]
[16221, '(tar)', 'S', 16220, 16220, 2532, 34816, 16068, 4194560, 891, 0, 0, 0, 80, 546, 0, 0, 20, 0, 1, 0, 603993, 2551808, 281, 4294967295L, 134508544, 134772428, 3217980208L, 3217979524L, 1115138, 0, 0, 553652224, 0, 0, 0, 0, 17, 0, 0, 0, 739]
[16222, '(gzip)', 'D', 16221, 16220, 2532, 34816, 16068, 4194304, 167, 0, 0, 0, 650, 174, 0, 0, 20, 0, 1, 0, 603996, 2162688, 117, 4294967295L, 134508544, 134566708, 3220637264L, 3220635252L, 1115138, 0, 0, 553652224, 8404995, 0, 0, 0, 17, 0, 0, 0, 25]
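For anyone reading the dump: the first five fields of each entry are pid, comm, state, ppid and pgrp, in the order of /proc/<pid>/stat. Here is a minimal sketch of grouping processes by process group from that data. It is not the kojid or Plague code; the helper names `parse_stat` and `group_by_pgrp` are made up for illustration, and the sample lines are just the leading fields of the snapshot above rewritten in raw stat format.

```python
from collections import defaultdict


def parse_stat(line):
    """Parse the leading fields of one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    head, _, rest = line.partition("(")
    comm, _, tail = rest.rpartition(")")
    fields = tail.split()
    return {
        "pid": int(head),
        "comm": comm,
        "state": fields[0],
        "ppid": int(fields[1]),
        "pgrp": int(fields[2]),
    }


def group_by_pgrp(procs):
    """Map each process group id to the pids that belong to it."""
    groups = defaultdict(list)
    for proc in procs:
        groups[proc["pgrp"]].append(proc["pid"])
    return dict(groups)


# Leading fields (pid, comm, state, ppid, pgrp, session) from the snapshot.
SAMPLE = [
    "16216 (mock) S 16068 16216 2532",
    "16219 (python) S 16216 16216 2532",
    "16220 (python) S 16219 16220 2532",
    "16221 (tar) S 16220 16220 2532",
    "16222 (gzip) D 16221 16220 2532",
]

procs = [parse_stat(line) for line in SAMPLE]
groups = group_by_pgrp(procs)
for pgrp, pids in sorted(groups.items()):
    print(pgrp, pids)
```

On a live system the same parser could be fed the contents of /proc/<pid>/stat for each pid of interest instead of the sample lines.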
On Mon, 4 Feb 2008 00:50:33 +0100, Michael Schwendt wrote:
> On Thu, 31 Jan 2008 15:30:19 -0500, Mike McLean wrote:
>>> Does that work successfully in koji? Here, the mock Python processes survive the kill signal. Only the parent mock process terminates. The other ones keep building until the job succeeds or fails.
>> Koji is pretty aggressive about killing off task processes.
>> https://hosted.fedoraproject.org/koji/browser/builder/kojid#L863
> Hmmm... but is it successful?
> It finds more child process groups and kills them, but here it doesn't kill all of mock's processes either.
> To take a closer look I've ported the code into Plague (on F8). Below is one snapshot of the /proc stats of the build processes. The code finds process groups 16216 and 16220. After sending SIGTERM (or SIGKILL, doesn't matter) to pgrp -16216, pid 16216 is reported as killed, but pid 16219 is still running. Another waitpid on the pgrp gives "[Errno 10] No child processes".
And the reason is that process 16219 gets a new parent as soon as its parent mock process 16216 is killed. When starting the builder from within the shell, that can be seen easily.
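The re-parenting can be reproduced outside of mock. The following is only a rough sketch under assumptions of my own: it uses plain fork() on a POSIX system with Python 3, the function name `reparenting_demo` is made up, and it says nothing about how mock actually spawns its children. A middle process forks a grandchild, the top-level process kills only the middle one, and the grandchild then reports its new parent. A second waitpid on the process group fails because the survivor is no longer a child of ours.

```python
import os
import signal
import time


def reparenting_demo():
    """Return (killed pid, grandchild's new ppid, whether waitpid hit ECHILD)."""
    r, w = os.pipe()
    child = os.fork()
    if child == 0:
        # Middle process: plays the role of the killed parent.
        os.close(r)
        os.setpgid(0, 0)  # become leader of a new process group
        middle = os.getpid()
        if os.fork() == 0:
            # Grandchild: plays the role of the surviving process.
            os.write(w, b"R")  # tell the top-level process we exist
            deadline = time.time() + 5
            while os.getppid() == middle and time.time() < deadline:
                time.sleep(0.05)
            os.write(w, str(os.getppid()).encode())
            os._exit(0)
        signal.pause()  # idle until killed
        os._exit(0)

    os.close(w)
    os.read(r, 1)  # wait until the grandchild is running
    os.kill(child, signal.SIGKILL)  # kill only the middle process
    os.waitpid(child, 0)  # reap it
    try:
        # The grandchild is still in that process group, but it is not
        # our child, so there is nothing left for us to wait on.
        os.waitpid(-child, 0)
        echild = False
    except ChildProcessError:
        echild = True
    new_ppid = int(os.read(r, 64).decode())
    os.close(r)
    return child, new_ppid, echild


killed, new_ppid, echild = reparenting_demo()
print("killed pid:", killed)
print("grandchild's new parent:", new_ppid)
print("second waitpid raised ECHILD:", echild)
```

The new parent is typically pid 1, though it can be another process if a subreaper is configured, so the sketch only checks that it differs from the killed pid.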
buildsys@lists.fedoraproject.org