https://bugzilla.redhat.com/show_bug.cgi?id=843731
Always the same process (ocamlopt.opt) and always on 32 bit only.
The thing is, it *didn't* happen just 3 days ago. Nothing has changed in the package, and ocamlopt.opt is the same as 3 days ago.
glibc has had a few memory-related fixes in the past 3 days:
- Revert patch for BZ696143, it made it impossible to use IPV6 addresses explicitly in getaddrinfo, which in turn broke ssh, apache and other code. (#808147)
- Avoid another unbound alloca in vfprintf (#841318)
- Remove /etc/localtime.tzupdate in lua scriptlets
- Revert back to using posix.symlink as posix.link with a 3rd argument isn't supported in the lua version embedded in rpm.
- Revert recent changes to res_send (804630, 835090).
- Fix memcpy args in res_send (#841787).
Has something changed in Koji or mock such as inherited signal masks?
I even went as far as building a 32 bit Rawhide VM to test this, but I can't reproduce it there, and that's pretty odd considering it happens reliably in Koji.
Rich.
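One way to check the inherited-signal-mask theory is to print the mask a process starts with, once from a local shell and once from inside the build environment, and compare the two. The following is a minimal sketch, assuming Python 3 is available in both places; the helper name `blocked_signals` is made up for illustration and is not part of Koji or mock.

```python
import signal


def blocked_signals():
    """Return the sorted names of signals blocked in the calling thread."""
    # SIG_BLOCK with an empty set changes nothing and returns the current mask,
    # which a freshly started process inherits from its parent.
    mask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
    names = []
    for sig in mask:
        # Real-time signals may come back as plain ints without a name.
        names.append(sig.name if isinstance(sig, signal.Signals) else str(int(sig)))
    return sorted(names)


if __name__ == "__main__":
    print("blocked at startup:", blocked_signals())
```

An empty list in one environment and a non-empty list in the other would point at the parent process rather than at the compiler.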
On Fri, 27 Jul 2012, Richard W.M. Jones wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=843731
> Always the same process (ocamlopt.opt) and always on 32 bit only.
> The thing is, it *didn't* happen just 3 days ago. Nothing has changed in the package, and ocamlopt.opt is the same as 3 days ago.
> glibc has had a few memory-related fixes in the past 3 days:
> - Revert patch for BZ696143, it made it impossible to use IPV6 addresses explicitly in getaddrinfo, which in turn broke ssh, apache and other code. (#808147)
> - Avoid another unbound alloca in vfprintf (#841318)
> - Remove /etc/localtime.tzupdate in lua scriptlets
> - Revert back to using posix.symlink as posix.link with a 3rd argument isn't supported in the lua version embedded in rpm.
> - Revert recent changes to res_send (804630, 835090).
> - Fix memcpy args in res_send (#841787).
> Has something changed in Koji or mock such as inherited signal masks?
> I even went as far as building a 32 bit Rawhide VM to test this, but I can't reproduce it there, and that's pretty odd considering it happens reliably in Koji.
Do you happen to know how much memory that build consumes? Most of the new builders are 4GB instances (w/ 2GB of swap) - could you be hitting the top end of memory?
-sv
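The memory question can be answered by running the failing command under a small wrapper that reports the child's peak resident set size. This is only a sketch, assuming Linux (where `ru_maxrss` is reported in kilobytes) and Python 3; `peak_child_rss_kib` is a hypothetical helper name, and the command to measure has to be supplied by whoever runs it.

```python
import resource
import subprocess
import sys


def peak_child_rss_kib(argv):
    """Run argv to completion and return (exit_status, peak_rss_kib).

    RUSAGE_CHILDREN reports the largest peak RSS among all children this
    process has waited for so far, so use a fresh interpreter per measurement.
    """
    status = subprocess.run(argv).returncode
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return status, usage.ru_maxrss


if __name__ == "__main__":
    # Usage: python peak_rss.py <command> [args...]
    status, rss = peak_child_rss_kib(sys.argv[1:] or [sys.executable, "-c", "pass"])
    print(f"exit status {status}, peak RSS about {rss} KiB")
```

A peak far below the builder's 4GB would make memory exhaustion an unlikely explanation.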
On Fri, Jul 27, 2012 at 01:16:32PM -0400, Seth Vidal wrote:
> Do you happen to know how much memory that build consumes? Most of the new builders are 4GB instances (w/ 2GB of swap) - could you be hitting the top end of memory?
I'm not going to say no -- the binary might be buggy -- but it seems unlikely. The program that it's compiling is only 62 lines long and it doesn't take significant memory or time when I run it locally.
Rich.
On Fri, 27 Jul 2012, Richard W.M. Jones wrote:
> On Fri, Jul 27, 2012 at 01:16:32PM -0400, Seth Vidal wrote:
> > Do you happen to know how much memory that build consumes? Most of the new builders are 4GB instances (w/ 2GB of swap) - could you be hitting the top end of memory?
> I'm not going to say no -- the binary might be buggy -- but it seems unlikely. The program that it's compiling is only 62 lines long and it doesn't take significant memory or time when I run it locally.
So - I can test a build of your program directly in mock, both on one of the builders with more memory and on one with the same memory, if it helps. I'm pretty sure that once the mock process is spawned, koji doesn't mess with it.
-sv
On Fri, Jul 27, 2012 at 11:08 AM, Richard W.M. Jones rjones@redhat.com wrote:
> Always the same process (ocamlopt.opt) and always on 32 bit only.
> The thing is, it *didn't* happen just 3 days ago. Nothing has changed in the package, and ocamlopt.opt is the same as 3 days ago.
Well .... you may recall that I haven't been able to build coq successfully for quite a while. The symptom there is apparently random segfaults during the build, on 32-bit x86 only; I've never had the problem with an x86_64 build. Maybe this problem has been around longer than you think, but something just made it more likely. (Or I'm hitting a different problem with the same symptoms.)
On Fri, Jul 27, 2012 at 11:32:57AM -0600, Jerry James wrote:
> On Fri, Jul 27, 2012 at 11:08 AM, Richard W.M. Jones rjones@redhat.com wrote:
> > Always the same process (ocamlopt.opt) and always on 32 bit only.
> > The thing is, it *didn't* happen just 3 days ago. Nothing has changed in the package, and ocamlopt.opt is the same as 3 days ago.
> Well .... you may recall that I haven't been able to build coq successfully for quite a while. The symptom there is apparently random segfaults during the build, on 32-bit x86 only; I've never had the problem with an x86_64 build. Maybe this problem has been around longer than you think, but something just made it more likely. (Or I'm hitting a different problem with the same symptoms.)
Yes, now I do recall that. Also, I seem to remember that it built OK (for me) locally, but not in Koji. That would be quite similar wouldn't it ...
Is there a BZ for the coq failure?
Rich.
On Fri, Jul 27, 2012 at 11:41 AM, Richard W.M. Jones rjones@redhat.com wrote:
> Yes, now I do recall that. Also, I seem to remember that it built OK (for me) locally, but not in Koji. That would be quite similar wouldn't it ...
> Is there a BZ for the coq failure?
> Rich.
No, I never filed one. I've never been sure whether it was a coq problem, an ocaml problem, or something else. I also haven't tried building for a while. I'm going to fire off a scratch build and see what happens.
On Fri, Jul 27, 2012 at 11:45 AM, Jerry James loganjerry@gmail.com wrote:
> No, I never filed one. I've never been sure whether it was a coq problem, an ocaml problem, or something else. I also haven't tried building for a while. I'm going to fire off a scratch build and see what happens.
Same as before; the binaries are created successfully, but coqdoc eventually segfaults on 32-bit x86 after some apparently random number of runs:
http://koji.fedoraproject.org/koji/taskinfo?taskID=4336975
The coq build may be a good one for tripping over whatever is wrong because of the sheer number of times coqdoc is invoked during the build.
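The "segfaults after some random number of runs" pattern can be chased outside a full package build by invoking one command in a loop until it is killed by a signal. The sketch below assumes Python 3 on a POSIX system; `run_until_signal` is a hypothetical helper, and the actual coqdoc or ocamlopt.opt command line has to be filled in by the person testing.

```python
import signal
import subprocess
import sys


def run_until_signal(argv, max_runs=1000):
    """Run argv repeatedly until a run is killed by a signal.

    Returns (run_number, signal_name) for the first such run,
    or None if all max_runs runs exit without being signalled.
    """
    for n in range(1, max_runs + 1):
        rc = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        ).returncode
        if rc < 0:
            # On POSIX, a negative return code -N means "killed by signal N".
            try:
                name = signal.Signals(-rc).name
            except ValueError:
                name = str(-rc)
            return n, name
    return None


if __name__ == "__main__":
    # Usage: python repeat.py <command> [args...]
    print(run_until_signal(sys.argv[1:] or [sys.executable, "-c", "pass"], max_runs=10))
```

Running the same loop locally and on a builder gives a rough failure rate for each environment, which is easier to compare than a single pass or fail.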
On Fri, Jul 27, 2012 at 02:22:09PM -0600, Jerry James wrote:
> On Fri, Jul 27, 2012 at 11:45 AM, Jerry James loganjerry@gmail.com wrote:
> > No, I never filed one. I've never been sure whether it was a coq problem, an ocaml problem, or something else. I also haven't tried building for a while. I'm going to fire off a scratch build and see what happens.
> Same as before; the binaries are created successfully, but coqdoc eventually segfaults on 32-bit x86 after some apparently random number of runs:
> http://koji.fedoraproject.org/koji/taskinfo?taskID=4336975
> The coq build may be a good one for tripping over whatever is wrong because of the sheer number of times coqdoc is invoked during the build.
Seth got me the IP addr where ocamlopt.opt was segfaulting, which is a good start:
https://bugzilla.redhat.com/show_bug.cgi?id=843731#c4
Rich.
On Fri, 27 Jul 2012, Richard W.M. Jones wrote:
> On Fri, Jul 27, 2012 at 02:22:09PM -0600, Jerry James wrote:
> > On Fri, Jul 27, 2012 at 11:45 AM, Jerry James loganjerry@gmail.com wrote:
> > > No, I never filed one. I've never been sure whether it was a coq problem, an ocaml problem, or something else. I also haven't tried building for a while. I'm going to fire off a scratch build and see what happens.
> > Same as before; the binaries are created successfully, but coqdoc eventually segfaults on 32-bit x86 after some apparently random number of runs:
> > http://koji.fedoraproject.org/koji/taskinfo?taskID=4336975
> > The coq build may be a good one for tripping over whatever is wrong because of the sheer number of times coqdoc is invoked during the build.
> Seth got me the IP addr where ocamlopt.opt was segfaulting, which is a good start:
If you can - test on your own system on an el6 vm.
Also - does this build need to connect to the external world at any point?
B/c our builders are.... ummm.... fairly well isolated.
-sv
On Fri, Jul 27, 2012 at 05:04:47PM -0400, Seth Vidal wrote:
> On Fri, 27 Jul 2012, Richard W.M. Jones wrote:
> > On Fri, Jul 27, 2012 at 02:22:09PM -0600, Jerry James wrote:
> > > On Fri, Jul 27, 2012 at 11:45 AM, Jerry James loganjerry@gmail.com wrote:
> > > > No, I never filed one. I've never been sure whether it was a coq problem, an ocaml problem, or something else. I also haven't tried building for a while. I'm going to fire off a scratch build and see what happens.
> > > Same as before; the binaries are created successfully, but coqdoc eventually segfaults on 32-bit x86 after some apparently random number of runs:
> > > http://koji.fedoraproject.org/koji/taskinfo?taskID=4336975
> > > The coq build may be a good one for tripping over whatever is wrong because of the sheer number of times coqdoc is invoked during the build.
> > Seth got me the IP addr where ocamlopt.opt was segfaulting, which is a good start:
> If you can - test on your own system on an el6 vm.
Does Koji still use Xen, or is it now on qemu-kvm?
> Also - does this build need to connect to the external world at any point?
Definitely not :-)
Rich.
On Fri, 27 Jul 2012, Richard W.M. Jones wrote:
> > If you can - test on your own system on an el6 vm.
> Does Koji still use Xen, or is it now on qemu-kvm?
kvm - I don't remember us ever using xen - but I could be blocking it out of my memory :)
> > Also - does this build need to connect to the external world at any point?
> Definitely not :-)
hey - just thought I'd ask :)
-sv
Coq has now been rebuilt .. although I suspect that's largely by luck. In any case you can go ahead and rebuild any dependencies that you wanted to.
Thanks,
Rich.
devel@lists.stg.fedoraproject.org