Hi there,
FYI, I've experienced a stability issue with the jetson-tk1 NIC since kernel 5.3. It is reported upstream at https://bugzilla.kernel.org/show_bug.cgi?id=206217. To sum up: under combined MMC and network I/O load (a dnf update or an scp of a large file), the pcieport receives AER errors that are fatal to the network interface, which cannot be recovered without a reboot.
I've bisected the issue and found the commit that, once reverted, restores good behaviour: https://patchwork.ozlabs.org/project/linux-tegra/patch/20200420164304.28810-... I haven't experienced any other regression since reverting it.
What I would like to ask is:
1/ Are there any other reproducers of this issue on jetson-tk1? (The issue is only relevant on the Tegra124 SoC.)
2/ As upstream agreed that a revert would be preferred pending further investigation, could we consider carrying the revert as a downstream patch until then?
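For anyone willing to try reproducing, here is a minimal sketch for spotting AER messages in the kernel log after running the I/O load. It assumes the relevant lines contain the token "AER" and that `dmesg` is readable by the current user (it may need root). The helper name `find_aer_lines` is only illustrative, and the exact message format varies between kernel versions.

```python
import re
import subprocess

# Assumption: AER reports in the kernel log contain the standalone token "AER".
AER_PATTERN = re.compile(r"\bAER\b")


def find_aer_lines(log_text):
    """Return the lines of log_text that mention AER."""
    return [line for line in log_text.splitlines() if AER_PATTERN.search(line)]


if __name__ == "__main__":
    # Read the kernel ring buffer; this may require root on some systems.
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    for line in find_aer_lines(result.stdout):
        print(line)
```

Running it after a large scp or a dnf update and getting no output would suggest the issue did not trigger on that run.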
Thanks for any advice on the topic.
kernel@lists.fedoraproject.org