For some weeks I have had a problem with a XenU domain used as a web server. It keeps turning into a Zombie domain:
# xm list
Name                 ID Mem(MiB) VCPUs State   Time(s)
Domain-0              0      641     4 r-----    263.6
Zombie-wwwmain        2      512     1 ----cd   5238.4
dns                   4      128     1 -b----     20.1
intranet              6      512     1 -b----     18.7
ldap                  5      128     1 -b----     61.4
mail                  3      512     1 r-----   1078.0
www                   1      512     1 -b----    117.7
wwwextra              7      512     1 -b----     33.2
(Both Xen0 and XenU are running FC5 with kernel 2.6.17-1.2174.)
- How can I find out what is causing this problem?
- How can I fix it / work around it / report a bug?
This is a production machine, so I would appreciate any pointers. I didn't have this problem before. Maybe the new kernel is causing the problem? Is there a way to get older versions of the kernel packages?
I would also like to note that when this happens networking doesn't work in ANY XenU domain. E.g.:
# xm console dns

After login, trying to ping a Google IP:

# ping 72.14.221.104
PING 72.14.221.104 (72.14.221.104) 56(84) bytes of data.
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
...
And when I run "init 0" in a XenU (e.g. in the dns XenU):

# init 0

it hangs at this step: "Removing module iptables: "
When I try to shut down a XenU from Xen0, it also turns into a Zombie. E.g. (the dns XenU domain turns into Zombie-dns):

# xm shutdown dns
What can I do?
Gregor Pirnaver wrote:
For some weeks I have a problem with XenU domain used as web server. It is turning into Zombie domain:
# xm list
Name                 ID Mem(MiB) VCPUs State   Time(s)
Domain-0              0      641     4 r-----    263.6
Zombie-wwwmain        2      512     1 ----cd   5238.4
Is that a fully virtualized domain, or paravirt?
(Both Xen0 and XenU are running FC5 with kernel 2.6.17-1.2174.)
- How can I find out what is causing this problem?
- How can I fix it / work around it / report a bug?
If it is paravirt, you can "xm sysrq <domain> <key>" to get some debugging output.
If the domain in question is fully virt, do you have a stale qemu-dm hanging around?
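To catch the moment a domain goes zombie (and so know when to look at the console or try sysrq), the `xm list` output quoted above can be checked from a script. The following is only a rough sketch: it parses text in the format shown in this thread, and the function names `parse_xm_list` and `zombie_domains` are made up for illustration. It assumes the usual `xm list` state letters (r, b, p, s, c, d) and treats a "Zombie-" name prefix or the "d" (dying) flag as the sign of a zombie, which is a heuristic rather than anything official.

```python
# Letters that can appear in the State column of `xm list`.
STATE_FLAGS = {
    "r": "running",
    "b": "blocked",
    "p": "paused",
    "s": "shutdown",
    "c": "crashed",
    "d": "dying",
}


def parse_xm_list(text):
    """Parse `xm list` output into a list of dicts, one per domain."""
    domains = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a 6-column domain row.
        if len(fields) != 6 or fields[0] == "Name":
            continue
        name, dom_id, mem, vcpus, state, cpu_time = fields
        domains.append({
            "name": name,
            "id": int(dom_id),
            "mem_mib": int(mem),
            "vcpus": int(vcpus),
            "flags": {STATE_FLAGS[ch] for ch in state if ch in STATE_FLAGS},
            "time_s": float(cpu_time),
        })
    return domains


def zombie_domains(domains):
    """Names of domains that look like zombies (assumed heuristic)."""
    return [d["name"] for d in domains
            if d["name"].startswith("Zombie-") or "dying" in d["flags"]]


# Sample rows copied from the output earlier in this thread.
SAMPLE = """\
Name ID Mem(MiB) VCPUs State Time(s)
Domain-0 0 641 4 r----- 263.6
Zombie-wwwmain 2 512 1 ----cd 5238.4
dns 4 128 1 -b---- 20.1
mail 3 512 1 r----- 1078.0
"""

if __name__ == "__main__":
    for name in zombie_domains(parse_xm_list(SAMPLE)):
        print("zombie:", name)
```

On a real Xen0 the text would come from running `xm list` (for example via `subprocess`) instead of the `SAMPLE` string; that part is left out here because it depends on the local setup.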
Gregor Pirnaver wrote:
For some weeks I have a problem with XenU domain used as web server. It is turning into Zombie domain:
I observed the same problem and filed a bug report about it which is active:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=199944
Add yourself to the CC on that bug...
Gregor Pirnaver wrote:
For some weeks I have a problem with XenU domain used as web server. It is turning into Zombie domain:
I observed the same problem and filed a bug report about it which is active:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=199944
Add yourself to the CC on that bug...
This is the bug report I'd found... It looks like yours is the same issue:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=204468
Kwan Lowe wrote:
This is the bug report I'd found... It looks like yours is the same issue:
I've added a note there. If people can leave their consoles open and see if there is a kernel panic similar to the one that I reported, it may be the same xennet related bug.
I've observed the same problem on a number of machines I administer (Mixed Intel, AMD, etc), and each time it seems to come down to the same xennet issue.
On Thu, Sep 21, 2006, Russell McOrmond wrote:
Kwan Lowe wrote:
This is the bug report I'd found... It looks like yours is the same issue:
I've added a note there. If people can leave their consoles open and see if there is a kernel panic similar to the one that I reported, it may be the same xennet related bug.
I've observed the same problem on a number of machines I administer (Mixed Intel, AMD, etc), and each time it seems to come down to the same xennet issue.
Here's a recent crash. I can't run 2187 because it crashes almost straight away (and I think I emailed the crashdump to the list already.)
It's still worrying that this oops takes out networking for all VMs.
Adrian
BUG: unable to handle kernel NULL pointer dereference at virtual address 000000bc
 printing eip:
c908e1ad
*pde = ma 1509f067 pa 01ff3067
*pte = ma 00000000 pa fffff000
Oops: 0002 [#1]
SMP
Modules linked in: ipv6 xennet dm_snapshot dm_zero dm_mirror dm_mod raid1
CPU:    0
EIP:    0061:[<c908e1ad>]    Not tainted VLI
EFLAGS: 00210046   (2.6.17-1.2157_FC5xenU #1)
EIP is at network_tx_buf_gc+0xc4/0x1b7 [xennet]
eax: 00000066   ebx: 00000032   ecx: c6540cfc   edx: 00000000
esi: 00000001   edi: c6540400   ebp: 0000002c   esp: c0651edc
ds: 007b   es: 007b   ss: 0069
Process swapper (pid: 0, threadinfo=c0650000 task=c05f1800)
Stack: <0>c6540cfc 00000000 00000000 00000004 c6540000 001ece31 001ece32 001ece12
       00000000 c6540488 c6540400 c6540000 c908f150 c01b5e80 00000000 00000000
       00000107 c043a57d 00000107 c6540000 c0651f88 c0651f88 00000107 c0643780
Call Trace:
 <c908f150> netif_int+0x24/0x66 [xennet]
 <c043a57d> handle_IRQ_event+0x42/0x85
 <c043a64d> __do_IRQ+0x8d/0xdc
 <c040665a> do_IRQ+0x1a/0x25
 <c0519efd> evtchn_do_upcall+0x66/0x9f
 <c0404d79> hypervisor_callback+0x3d/0x48
 <c0407a6a> safe_halt+0x84/0xa7
 <c0402bde> xen_idle+0x46/0x4e
 <c0402cfd> cpu_idle+0x94/0xad
 <c0655772> start_kernel+0x346/0x34c
Code: b4 9f 00 09 00 00 50 e8 9d c5 48 f7 c7 84 9f 00 09 00 00 00 00 00 00 8b 87 f4 00 00 00 89 84 9f f4 00 00 00 89 9f f4 00 00 00 90 <ff> 8d 90 00 00 00 0f 94 c0 83 c4 10 84 c0 74 62 bb 00 e0 ff ff
EIP: [<c908e1ad>] network_tx_buf_gc+0xc4/0x1b7 [xennet] SS:ESP 0069:c0651edc
 <0>Kernel panic - not syncing: Fatal exception in interrupt
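For anyone comparing their own console output against the bug reports, it may help to reduce an oops like the one above to a short "signature" (module, function, offset, fault type). Below is a small sketch of that idea. The helper names `oops_signature` and `decode_fault_code` are invented for illustration, and the regular expressions only assume the two line formats visible in this trace ("Oops: NNNN" and "EIP is at ..."). The bit meanings are the standard i386 page-fault error code bits.

```python
import re

# "EIP is at <symbol>+0x<offset>/0x<size> [<module>]"; the [module] part is
# optional because code built into the kernel image has no module name.
EIP_RE = re.compile(
    r"EIP is at (?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)
OOPS_RE = re.compile(r"Oops: (?P<code>[0-9a-f]{4})")


def decode_fault_code(code):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),  # 0 = page not present
        "write": bool(code & 0x2),         # 0 = read access, 1 = write access
        "user_mode": bool(code & 0x4),     # 0 = kernel mode, 1 = user mode
    }


def oops_signature(text):
    """Return a dict describing where and how the oops happened, or None."""
    eip = EIP_RE.search(text)
    oops = OOPS_RE.search(text)
    if not eip or not oops:
        return None
    return {
        "module": eip.group("module"),
        "symbol": eip.group("symbol"),
        "offset": int(eip.group("offset"), 16),
        "fault": decode_fault_code(int(oops.group("code"), 16)),
    }


# Two lines copied from the trace above.
SAMPLE_OOPS = """\
Oops: 0002 [#1]
EIP is at network_tx_buf_gc+0xc4/0x1b7 [xennet]
"""

if __name__ == "__main__":
    print(oops_signature(SAMPLE_OOPS))
```

For this trace the signature comes out as a kernel-mode write to a not-present page inside `network_tx_buf_gc` in the xennet module, which is the detail worth comparing against the panic in the bug report.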
xen@lists.stg.fedoraproject.org