On Tue, Jan 9, 2018 at 1:51 PM, Maxim Burgerhout maxim@wzzrd.com wrote:
I'm getting kernel panics in a VM that functions as a hypervisor, the moment I spin up the nested guest (on AMD ThreadRipper / Fedora 27). That is annoying, of course, so I try to be a good citizen and file a bug.
For some reason, though, I cannot get the core dumped. I get a core fine with sysrq, but not with this actual panic. I've followed [1] to set up kdump and crash, but every time I trigger the crash and see my VM reboot, I find an empty /var/crash afterwards.
As I was able to get the vmcore written to /var/crash in a RHEL7 guest, I'm starting to suspect a bug, but I'm unsure.
Any pointers on how to debug this?
[1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes
Adding the Fedora kernel list.
Kdump isn't automatically tested in Fedora and while it can work, it can often be broken as well. There might be someone on the kernel list that is more familiar with the current state of kdump support in Fedora, or alternative methods for getting the kernel backtrace.
josh
(cc'ing Dave Young)
On Tue, Jan 16, 2018 at 07:41:36AM -0500, Josh Boyer wrote:
josh _______________________________________________ kernel mailing list -- kernel@lists.fedoraproject.org To unsubscribe send an email to kernel-leave@lists.fedoraproject.org
Don, thanks for ccing me. On 01/16/18 at 07:47am, Don Zickus wrote:
One thing to check is whether the kdump service started successfully before the crash, i.e. check that /sys/kernel/kexec_crash_loaded reads 1.
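The check described above can be scripted. The following is only a minimal sketch in Python: the function name `kdump_kernel_loaded` is made up for illustration, and the only thing taken from this thread is the sysfs path. It assumes the file contains `1` when a crash kernel is loaded and `0` otherwise.

```python
from pathlib import Path

# Path named in the message above; it reads "1" when a crash (kdump) kernel is loaded.
KEXEC_CRASH_LOADED = Path("/sys/kernel/kexec_crash_loaded")


def kdump_kernel_loaded(path: Path = KEXEC_CRASH_LOADED) -> bool:
    """Return True if the flag file reads 1, False if it reads anything else or is unreadable."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        # File missing or unreadable, e.g. a kernel built without kexec support.
        return False


if __name__ == "__main__":
    if kdump_kernel_loaded():
        print("crash kernel loaded: a panic should boot into the kdump kernel")
    else:
        print("crash kernel NOT loaded: a panic cannot produce a vmcore")
```

Running this inside the guest shortly before reproducing the panic shows whether the crash kernel was loaded at all. If it reports that the crash kernel is not loaded, the empty /var/crash is expected and the problem lies in the kdump service setup rather than in the dump itself.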
If you use a self-built kernel, you can try the patch below for testing:
---
It is useful to print kdump kernel loaded status in dump_stack()
especially when panic happens so that we can differentiate
kdump kernel early hang and a normal panic in a bug report.

Signed-off-by: Dave Young dyoung@redhat.com
---
 kernel/printk/printk.c | 3 +++
 1 file changed, 3 insertions(+)

--- linux-x86.orig/kernel/printk/printk.c
+++ linux-x86/kernel/printk/printk.c
@@ -48,6 +48,7 @@
 #include <linux/sched/clock.h>
 #include <linux/sched/debug.h>
 #include <linux/sched/task_stack.h>
+#include <linux/kexec.h>
 
 #include <linux/uaccess.h>
 #include <asm/sections.h>
@@ -3127,6 +3128,8 @@ void dump_stack_print_info(const char *l
 	if (dump_stack_arch_desc_str[0] != '\0')
 		printk("%sHardware name: %s\n",
 		       log_lvl, dump_stack_arch_desc_str);
+	if (kexec_crash_loaded())
+		printk("%skdump kernel loaded\n", log_lvl);
 
 	print_worker_info(log_lvl, current);
 }
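With a kernel carrying the patch above, a captured panic log (for example from the VM's serial console) can be searched for the added line. This is a hedged sketch only: the function name and the log file argument are hypothetical, and the marker string is copied from the printk() in the patch.

```python
import sys

# Marker string printed by the patch above when kexec_crash_loaded() is true.
MARKER = "kdump kernel loaded"


def panic_log_shows_kdump_loaded(log_text: str) -> bool:
    """Return True if any line of the captured console log contains the marker."""
    return any(MARKER in line for line in log_text.splitlines())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python3 check_log.py console.log
    with open(sys.argv[1], errors="replace") as fh:
        found = panic_log_shows_kdump_loaded(fh.read())
    if found:
        print("marker found: the kdump kernel was loaded when the stack was dumped")
    else:
        print("marker not found: kdump kernel not loaded, or kernel not patched")
```

If the marker is present but no vmcore appears, the kdump kernel most likely hung early during its own boot. If it is absent on a patched kernel, the crash kernel was never loaded.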
Yes, since the Fedora kernel is updated frequently, it is no surprise that kdump sometimes does not work. But it is always good to report a bug against the "kexec-tools" component or the "kernel" -> "Kexec/kdump" subcomponent.
Thanks Dave
On 01/17/18 at 09:31am, Dave Young wrote:
Hmm, I noticed that Bugzilla has no "Kexec/kdump" subcomponent for Fedora. In that case, kdump bugs can be filed against "kexec-tools" so that we are made aware of them.
Thanks Dave
kernel@lists.fedoraproject.org