Somewhere after the 2.6.18 timeframe, Andi Kleen made the x86_64 e820 map __initdata:
struct e820map e820 __initdata;
This is fine for upstream x86_64 kernels, because the e820 map never gets used during runtime. But because we (RHEL/Fedora) have an x86_64 version of page_is_ram(), it ends up using __init data.
I became aware of this through FC6/crash-utility bugzilla #237383, filed because the crash utility started failing on later FC6 live systems. The failure comes from the kernel's /dev/crash module, crash.ko, which uses page_is_ram() as a memory access qualifier.
Also, because of Red Hat's restriction on the use of /dev/mem to only the first 256 RAM pages, the __initdata addition ends up opening the floodgates for /dev/mem usage. That's because of this:
/*
 * devmem_is_allowed() checks to see if /dev/mem access to a certain address is
 * valid. The argument is a physical page number.
 *
 *
 * On x86-64, access has to be given to the first megabyte of ram because that area
 * contains bios code and data regions used by X and dosemu and similar apps.
 * Access has to be given to non-kernel-ram areas as well, these contain the PCI
 * mmio resources as well as potential bios/acpi data regions.
 */
int devmem_is_allowed(unsigned long pagenr)
{
	if (pagenr <= 256)
		return 1;
	if (!page_is_ram(pagenr))
		return 1;
	return 0;
}
The function is meant to allow /dev/mem accesses above pagenr 256 only if they are *not* RAM -- but since page_is_ram() is failing, it inadvertently allows access to any pagenr.
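To make the failure mode concrete, here is a minimal userspace sketch, not the actual RHEL/Fedora code. The page_is_ram() body, the trimmed-down struct layouts, E820MAX, and the two helper functions e820_setup_example() and e820_simulate_init_free() are all assumptions made for illustration; only devmem_is_allowed() is copied from above. Freed __init memory is modeled here as a zeroed map, whereas on a real kernel its contents are simply no longer reliable.

```c
#include <string.h>

#define PAGE_SHIFT 12
#define E820_RAM   1
#define E820MAX    8	/* hypothetical small limit for this sketch */

/* Simplified stand-ins for the kernel structures (assumed layout). */
struct e820entry {
	unsigned long long addr;	/* start of region, in bytes */
	unsigned long long size;	/* length of region, in bytes */
	unsigned int type;		/* E820_RAM or anything else */
};

struct e820map {
	int nr_map;
	struct e820entry map[E820MAX];
};

static struct e820map e820;	/* stand-in for the kernel's global map */

/* Assumed shape of page_is_ram(): walk the map looking for a RAM region. */
int page_is_ram(unsigned long pagenr)
{
	unsigned long long addr = (unsigned long long)pagenr << PAGE_SHIFT;
	int i;

	for (i = 0; i < e820.nr_map; i++) {
		const struct e820entry *ei = &e820.map[i];

		if (ei->type != E820_RAM)
			continue;
		if (addr >= ei->addr && addr < ei->addr + ei->size)
			return 1;
	}
	return 0;
}

/* Same logic as the function quoted above. */
int devmem_is_allowed(unsigned long pagenr)
{
	if (pagenr <= 256)
		return 1;
	if (!page_is_ram(pagenr))
		return 1;
	return 0;
}

/* Hypothetical helper: two RAM regions, 0-636K and 1MB-1GB. */
void e820_setup_example(void)
{
	memset(&e820, 0, sizeof(e820));
	e820.map[0] = (struct e820entry){ 0x0ULL, 0x9f000ULL, E820_RAM };
	e820.map[1] = (struct e820entry){ 0x100000ULL, 0x3ff00000ULL, E820_RAM };
	e820.nr_map = 2;
}

/* Hypothetical helper: model the map's storage being discarded after boot. */
void e820_simulate_init_free(void)
{
	memset(&e820, 0, sizeof(e820));
}
```

With the map populated, pagenr 0x1000 (16MB) is RAM and devmem_is_allowed() rejects it. After e820_simulate_init_free(), page_is_ram() finds no RAM region at all, so the same pagenr is allowed, which is the inversion described above.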
(Interestingly enough, this bug allows the user of the crash utility to work around the /dev/crash failure by alternatively using /dev/mem instead! /dev/crash was only created to begin with because of the /dev/mem restriction...)
Anyway, reverting back from __initdata fixes the situation for both /dev/crash and the /dev/mem restriction.
Dave Anderson
(patch is against 2.6.20-1.2944)
--- linux-2.6.20.x86_64/arch/x86_64/kernel/e820.c.orig 2007-04-26 14:38:12.000000000 -0400
+++ linux-2.6.20.x86_64/arch/x86_64/kernel/e820.c 2007-04-26 14:38:24.000000000 -0400
@@ -25,7 +25,7 @@
 #include <asm/bootsetup.h>
 #include <asm/sections.h>
 
-struct e820map e820 __initdata;
+struct e820map e820;
 
 /*
  * PFN of last memory page.
On Thu, 26 Apr 2007 15:40:00 -0400, Dave Anderson anderson@redhat.com wrote:
> This is fine for upstream x86_64 kernels, because the e820 map never gets used during runtime. But because we (RHEL/Fedora) have an x86_64 version of page_is_ram(), it ends up using __init data.

> +++ linux-2.6.20.x86_64/arch/x86_64/kernel/e820.c 2007-04-26 14:38:24.000000000 -0400
> @@ -25,7 +25,7 @@
>  #include <asm/bootsetup.h>
>  #include <asm/sections.h>
>
> -struct e820map e820 __initdata;
> +struct e820map e820;
ACK
Although, do we follow the ACK process for stable Fedora now? I thought it was just Chuck's judgement ever since he joined.
On Dave's side I looked at Rawhide just now and this does not seem to be there either (on 1.3126), although linux-2.6-devmem.patch and Xen are.
-- Pete
kernel@lists.fedoraproject.org