Hello,
I am trying to track down a problem with a laptop: whenever it goes into suspend for any reason, the wired network won't come back up. Only a reboot will re-enable the wired network.
This problem started in February after a kernel update with Fedora 26. Upgraded to Fedora 27 today and the problem still persists. I was hoping it would be fixed.
The only indication of any issue is an error message that pops up. kernel: do_IRQ: 7.33 No irq handler for vector.
I would like to find more details but if I cannot I will just file a bug against the kernel.
Robin
On 03/25/2018 02:49 PM, Robin Laing wrote:
I am trying to trace down a problem with a laptop that when it goes into suspend for any reason, the network won't come back up. Only a reboot will enable the wired network.
Have you tried unloading and reloading the kernel module?
The only indication of any issue is an error message that pops up. kernel: do_IRQ: 7.33 No irq handler for vector.
These messages are usually benign.
I would like to find more details but if I cannot I will just file a bug against the kernel.
Have you checked the journal for the time around the resume? Note that the first chunk of messages at the resume time are actually from the end of the suspend before the resume.
It would also be useful to know the network chipset. "lspci -v" will tell you both the chipset and the kernel driver being used. After resume, try doing "modprobe -r <modulename>", then if that was successful, do "modprobe <modulename>" and see if that fixes it.
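As a minimal sketch of the reload step suggested above (not a definitive procedure): the helper name reload_module is made up for illustration, and the module name stays a placeholder until "lspci -v" has reported the actual driver. It only loads the module again if the unload succeeded.

```shell
#!/bin/sh
# Reload a kernel module: unload it, and load it again only if the unload worked.
# Needs root when run for real (e.g. via sudo).
reload_module() {
    module="$1"
    if modprobe -r "$module"; then
        modprobe "$module"
    else
        echo "could not unload $module (still in use?)" >&2
        return 1
    fi
}

# Example (placeholder module name; use the one reported by "lspci -v"):
#   reload_module <modulename>
# Kernel messages from the current boot around the resume can then be checked with:
#   journalctl -b -k --since "10 minutes ago"
```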
On 25/03/18 17:34, Samuel Sieb wrote:
On 03/25/2018 02:49 PM, Robin Laing wrote:
I am trying to trace down a problem with a laptop that when it goes into suspend for any reason, the network won't come back up. Only a reboot will enable the wired network.
Have you tried unloading and reloading the kernel module?
The only indication of any issue is an error message that pops up. kernel: do_IRQ: 7.33 No irq handler for vector.
These messages are usually benign.
I would like to find more details but if I cannot I will just file a bug against the kernel.
Have you checked the journal for the time around the resume? Note that the first chunk of messages at the resume time are actually from the end of the suspend before the resume.
It would also be useful to know the network chipset. "lspci -v" will tell you both the chipset and the kernel driver being used. After resume, try doing "modprobe -r <modulename>", then if that was successful, do "modprobe <modulename>" and see if that fixes it.
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-leave@lists.fedoraproject.org
I have looked through the journal logs before, but I am still learning journalctl.
Looking through my notes, the problem seems to start around Feb 26.
Network controller is:
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
Module is:
Kernel modules: r8169
From the journal: the lid close is detected and NetworkManager shuts down the connection.
Network name is enp4s0
Start of suspend
Mar 25 21:31:44 xx NetworkManager[7949]: <info> [1522013504.0482] device (enp4s0): state change: activated -> deactivating (reason 'sleeping', internal state 'managed')
Mar 25 21:31:44 xx NetworkManager[7949]: <info> [1522013504.0920] device (enp4s0): state change: deactivating -> disconnected (reason 'sleeping', internal state 'managed')
Mar 25 21:31:44 xx avahi-daemon[7862]: Withdrawing address record for 2001:56a:7680:b500:4216:7eff:fe10:e09a on enp4s0.
Mar 25 21:31:44 xx NetworkManager[7949]: <info> [1522013504.0926] dhcp6 (enp4s0): canceled DHCP transaction
Mar 25 21:31:44 xx avahi-daemon[7862]: Leaving mDNS multicast group on interface enp4s0.IPv6 with address 2001:56a:7680:b500:4216:7eff:fe10:e09a.
Mar 25 21:31:44 xx avahi-daemon[7862]: Joining mDNS multicast group on interface enp4s0.IPv6 with address fe80::4216:7eff:fe10:e09a.
Mar 25 21:31:44 xx avahi-daemon[7862]: Registering new address record for fe80::4216:7eff:fe10:e09a on enp4s0.*.
Mar 25 21:31:44 xx avahi-daemon[7862]: Withdrawing address record for fe80::4216:7eff:fe10:e09a on enp4s0.
Mar 25 21:31:44 xx avahi-daemon[7862]: Leaving mDNS multicast group on interface enp4s0.IPv6 with address fe80::4216:7eff:fe10:e09a.
Mar 25 21:31:44 xx avahi-daemon[7862]: Interface enp4s0.IPv6 no longer relevant for mDNS.
Mar 25 21:31:44 xx avahi-daemon[7862]: Withdrawing address record for 192.168.1.21 on enp4s0.
Mar 25 21:31:44 xx avahi-daemon[7862]: Leaving mDNS multicast group on interface enp4s0.IPv4 with address 192.168.1.21.
Mar 25 21:31:44 xx avahi-daemon[7862]: Interface enp4s0.IPv4 no longer relevant for mDNS.
Mar 25 21:31:44 xx NetworkManager[7949]: <info> [1522013504.0950] device (enp4s0): state change: disconnected -> unmanaged (reason 'sleeping', internal state 'managed')
Mar 25 21:31:44 xx nm-dispatcher[9588]: req:2 'down' [enp4s0]: new request (6 scripts)
Mar 25 21:31:44 xx nm-dispatcher[9588]: req:2 'down' [enp4s0]: start running ordered scripts...
Start of open lid from suspend
Mar 25 21:33:45 xx NetworkManager[7949]: <info> [1522013625.9934] device (enp4s0): state change: unmanaged -> unavailable (reason 'managed', internal state 'managed')
Mar 25 21:33:45 xx kernel: IPv6: ADDRCONF(NETDEV_UP): enp4s0: link is not ready
Mar 25 21:33:46 xx kernel: r8169 0000:04:00.0 enp4s0: link down
Mar 25 21:33:46 xx kernel: IPv6: ADDRCONF(NETDEV_UP): enp4s0: link is not ready
This laptop is using KDE and sddm. Is there a
Looking further through the log files at another suspend today I came across this.
Mar 26 01:07:54 xx kernel: r8169 0000:04:00.0 enp4s0: link down
Also, I found this, but I am not sure if it is related.
Mar 26 01:07:54 xx audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 26 01:07:55 xx ModemManager[1116]: <info> Couldn't check support for device at '/sys/devices/pci0000:00/0000:00:1c.2/0000:03:00.0': not supported by any plugin
Mar 26 01:07:55 xx ModemManager[1116]: <info> Couldn't check support for device at '/sys/devices/pci0000:00/0000:00:1c.3/0000:04:00.0': not supported by any plugin
Mar 26 01:07:55 xx kernel: do_IRQ: 7.33 No irq handler for vector
Looking further into the log files, I don't see any mention of r8169 before March 17, when I tried a change to the boot parameters based on something I found on the net. That was almost a month after the problem started.
pci=nomsi,noaer
I will try the modprobe when I can.
Robin
On 03/25/2018 07:45 PM, Robin Laing wrote:
Network controller is:
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
Module is:
Kernel modules: r8169
That's the right driver. One thing you could try is after removing the module, try "modprobe r8169 debug=n" where n is a number up to 16. That will give you more debugging info in the log. Careful, 16 might really spam the log, so maybe start at 8 and work your way up.
Start of open lid from suspend
Mar 25 21:33:45 xx NetworkManager[7949]: <info> [1522013625.9934] device (enp4s0): state change: unmanaged -> unavailable (reason 'managed', internal state 'managed')
Mar 25 21:33:45 xx kernel: IPv6: ADDRCONF(NETDEV_UP): enp4s0: link is not ready
Mar 25 21:33:46 xx kernel: r8169 0000:04:00.0 enp4s0: link down
Mar 25 21:33:46 xx kernel: IPv6: ADDRCONF(NETDEV_UP): enp4s0: link is not ready
The driver is saying that there is no link detected. Are the lights on? What does "ethtool enp4s0" tell you?
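A small sketch for pulling just the link status out of the ethtool output, so it can be compared before and after a suspend. The helper name link_detected is hypothetical; it relies only on the "Link detected: yes/no" line that ethtool prints for an interface.

```shell
#!/bin/sh
# Print the "Link detected" value ("yes" or "no") from ethtool output read on stdin.
link_detected() {
    awk -F': ' '/Link detected/ { print $2 }'
}

# Example (interface name taken from the thread):
#   ethtool enp4s0 | link_detected
```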
Mar 26 01:07:55 xx ModemManager[1116]: <info> Couldn't check support for device at '/sys/devices/pci0000:00/0000:00:1c.3/0000:04:00.0': not supported by any plugin
This one looks like your network card, but you don't want ModemManager doing anything with it anyway.
Looking further into the log files, I don't see any mention of r8169 before March 17, when I tried a change to the boot parameters based on something I found on the net. That was almost a month after the problem started.
pci=nomsi,noaer
I would suggest removing this.
My guess, given that reloading the driver makes it work again, is that after resume, the driver is not turning some part of the chipset back on. Maybe the interrupts are not getting turned back on.
Mar 26 01:07:55 xx kernel: do_IRQ: 7.33 No irq handler for vector
What does "grep r8169 /proc/interrupts" give you when the interface is working? Try it a couple of times and see how the numbers change. Then when it's not working try it again a few times and see if the numbers are still changing.
I have been busy and unable to look at this until today.
On 26/03/18 01:05, Samuel Sieb wrote:
On 03/25/2018 07:45 PM, Robin Laing wrote:
Network controller is:
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
Module is:
Kernel modules: r8169
That's the right driver. One thing you could try is after removing the module, try "modprobe r8169 debug=n" where n is a number up to 16. That will give you more debugging info in the log. Careful, 16 might really spam the log, so maybe start at 8 and work your way up.
Going to 16 didn't make any difference from 10.
Apr 13 14:03:41 tdllap kernel: r8169 0000:04:00.0: can't disable ASPM; OS doesn't have ASPM control
Apr 13 14:03:41 tdllap kernel: r8169 0000:04:00.0 eth0: RTL8168g/8111g at 0x00000000ef2b4190, 40:16:7e:10:e0:9a, XID 0c000880 IRQ 34
Apr 13 14:03:41 tdllap kernel: r8169 0000:04:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
Apr 13 14:03:41 tdllap kernel: r8169 0000:04:00.0 enp4s0: renamed from eth0
Apr 13 14:03:41 tdllap kernel: r8169 0000:04:00.0 enp4s0: link down
Apr 13 14:03:44 tdllap kernel: r8169 0000:04:00.0 enp4s0: link up
The driver is saying that there is no link detected. Are the lights on? What does "ethtool enp4s0" tell you?
The link lights on the switch come on when the lid is closed and reopened, even without reloading the network driver.
ethtool shows "Link detected: no", which is interesting.
Looking further into the log files, I don't see any mention of r8169 before March 17, when I tried a change to the boot parameters based on something I found on the net. That was almost a month after the problem started.
pci=nomsi,noaer
I would suggest removing this.
My guess, given that reloading the driver makes it work again, is that after resume, the driver is not turning some part of the chipset back on. Maybe the interrupts are not getting turned back on.
Mar 26 01:07:55 xx kernel: do_IRQ: 7.33 No irq handler for vector
What does "grep r8169 /proc/interrupts" give you when the interface is working? Try it a couple of times and see how the numbers change. Then when it's not working try it again a few times and see if the numbers are still changing.
This is from /proc/interrupts. The line doesn't change between suspends, and it doesn't disappear. It is there from boot until I remove the module.
34: 0 0 0 0 125 0 0 175 IR-PCI-MSI 2097152-edge enp4s0
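One way to make the "are the numbers still changing" check repeatable is to total the per-CPU counters on the matching /proc/interrupts line and compare two readings taken a few seconds apart. The sketch below assumes the usual layout (IRQ number, one counter per CPU, then the controller type and device name); irq_total is a hypothetical helper name, and the optional second argument exists only so the function can be tried on a saved copy of the file.

```shell
#!/bin/sh
# Sum the per-CPU interrupt counters on every line matching a pattern.
# $1: pattern to match (e.g. the interface name), $2: file to read (default /proc/interrupts)
irq_total() {
    pattern="$1"
    file="${2:-/proc/interrupts}"
    awk -v pat="$pattern" '
        $0 ~ pat {
            total = 0
            # Field 1 is the IRQ number; counters follow until the first non-numeric field.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^[0-9]+$/) total += $i
                else break
            }
            print total
        }' "$file"
}

# Example: take two readings five seconds apart and compare them.
#   before=$(irq_total enp4s0); sleep 5; after=$(irq_total enp4s0)
#   echo "interrupts in 5s: $((after - before))"
```

For the line shown above, the total is 300 (125 + 175). If the total stops increasing after a resume while traffic is being attempted, that would be consistent with the interrupts not being delivered.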
It used to work until February, but I don't know which update affected it, as I wasn't told there was an issue until a few kernel updates later.
What I found is that when I load the module, lsmod gives me this:
r8169 94208 0
mii 16384 1 r8169
I am going to look more at the mii-tool and see if that has anything to do with it.
I did find another thread about kernel modules being broken in February and specifically mentioning r8169 module not reloading on suspend.
https://forum.manjaro.org/t/linux415-r8168-cant-connect-to-the-network-after...
https://forum.manjaro.org/t/kernel-update-broke-ethernet-driver-realtek-r816...
Robin
On 25/03/18 17:34, Samuel Sieb wrote:
On 03/25/2018 02:49 PM, Robin Laing wrote:
I am trying to trace down a problem with a laptop that when it goes into suspend for any reason, the network won't come back up. Only a reboot will enable the wired network.
Have you tried unloading and reloading the kernel module?
The only indication of any issue is an error message that pops up. kernel: do_IRQ: 7.33 No irq handler for vector.
These messages are usually benign.
I would like to find more details but if I cannot I will just file a bug against the kernel.
Have you checked the journal for the time around the resume? Note that the first chunk of messages at the resume time are actually from the end of the suspend before the resume.
It would also be useful to know the network chipset. "lspci -v" will tell you both the chipset and the kernel driver being used. After resume, try doing "modprobe -r <modulename>", then if that was successful, do "modprobe <modulename>" and see if that fixes it.
Finally got to try the modprobe and it did restart the network.
sudo modprobe -r r8169
sudo modprobe r8169
So, what is my next step in finding out why this won't restart on suspend?
Robin
On 26.03.2018 08:40, Robin Laing wrote:
sudo modprobe -r r8169
sudo modprobe r8169
So, what is my next step in finding out why this won't restart on suspend?
You can install a script that automatically unloads and reloads your network driver around suspend.
see
https://blog.christophersmart.com/2016/05/11/running-scripts-before-and-afte...
best regards Ulf
On 26/03/18 11:57, Ulf Volmer wrote:
On 26.03.2018 08:40, Robin Laing wrote:
sudo modprobe -r r8169
sudo modprobe r8169
So, what is my next step in finding out why this won't restart on suspend?
You can install a script that automatically unloads and reloads your network driver around suspend.
see
https://blog.christophersmart.com/2016/05/11/running-scripts-before-and-afte...
best regards Ulf
This works.
Thanks.
This is the script I used.
#!/bin/sh
if [ "${1}" = "pre" ]; then
    # Do the thing you want before suspend here, e.g.:
    # echo "we are suspending at $(date)..." > /tmp/systemd_suspend_test
    modprobe -r r8169
elif [ "${1}" = "post" ]; then
    # Do the thing you want after resume here, e.g.:
    # echo "...and we are back from $(date)" >> /tmp/systemd_suspend_test
    modprobe r8169
fi
Robin
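For reference, a sketch of the same hook written with a case statement. It assumes the script is installed as an executable file under /usr/lib/systemd/system-sleep/ (the file name in the comment is hypothetical), where systemd runs it with "pre" or "post" as the first argument and the sleep type as the second. The function name sleep_hook and the MODULE variable are illustrative, not part of the script posted above.

```shell
#!/bin/sh
# Hypothetical location: /usr/lib/systemd/system-sleep/r8169-reload.sh (must be executable).
# systemd passes "pre" or "post" as $1 and the sleep type (e.g. "suspend") as $2.
MODULE="r8169"

sleep_hook() {
    case "$1" in
        pre)  modprobe -r "$MODULE" ;;   # unload the driver before suspending
        post) modprobe "$MODULE" ;;      # load it again after resume
        *)    echo "usage: $0 pre|post [sleep-type]" >&2; return 2 ;;
    esac
}

# In the installed script, the last line would be:
#   sleep_hook "$@"
```

Keeping the logic in a function makes it possible to try the hook by hand with a stubbed modprobe before installing it.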