Some old mails that should have been sent to the list
-------- Original Message --------
Subject: Re: /bpfs
Date: Wed, 1 Feb 2012 11:04:57 -0800
From: John Hawkes <jhawkes(a)penguincomputing.com>
To: Adam Young <ayoung(a)redhat.com>
I figured it out. Yes, an explicit mount is required. It doesn't
look like this explicit mount occurs in RHEL 4&5, but ... I press
onward, and I'll figure that out later. So now I've got the /bpfs to
seemingly work correctly on the RHEL 6 slave.
Now I'll work on why my simple bproc_move() program is segfaulting.
It does execute in usermode -- it can perform a syscall(__NR_bproc) to
read the bproc version number, but a syslog() causes a segfault. If I
eliminate the syslog(), then a call to bproc_currnode() works. And if
I add a printf() at the end, it prints the return value from
bproc_currnode(). Alas, running this a 2nd time causes a segfault.
I'm trying to puzzle out why it works sometimes and fails other times.
John
On Wed, Feb 1, 2012 at 10:42 AM, Adam Young <ayoung(a)redhat.com> wrote:
> I really don't remember much about bpfs at all, except for the feeling that
> it was a mistake, and we should have been using sysfs instead. I was half
> convinced that we should roll it back to the way it was in the bproc code
> prior to moving to the 2.6 Linux Kernel, but thought it might be easier to
> move forward than to move back.
>
> As I recall, an explicit mount was required, but that has really faded in my
> memory.
>
>
>
> On 01/31/2012 09:12 PM, John Hawkes wrote:
>>
>> I changed the /bpfs to behave a bit more like /proc, in that after the
>> register_filesystem(), the bpfs code also does a kern_mount_data(),
>> which does the mount, which gets into bpfs_get_sb() and then (because
>> the MS_KERNMOUNT flag is set) does the bpfs_fill_super(). On the
>> master node, /etc/init.d/beowulf does an explicit mount, which I think
>> gets into bpfs_get_sb() a 2nd time, but now the MS_KERNMOUNT flag is
>> off.
>>
>> On the slave, the same MS_KERNMOUNT flag is seen by bpfs_get_sb(),
>> and the same bpfs_fill_super(), and no 2nd mount is seen (at least not
>> until the /etc/beowulf/fstab may do it, which would be unnecessary).
>>
>> But the readdir() still looks bad on the slave.
>>
>> On the master:
>>
>> readdir('/bpfs') ino:1 name:'.'
>> readdir('/bpfs') ino:1 name:'..'
>> readdir('/bpfs') ino:8 name:'self'
>> readdir('/bpfs') ino:13 name:'-1'
>> readdir('/bpfs') ino:11 name:'status'
>>
>> which is what I expect.
>> But on the slave (after I've registered various nodes):
>>
>> readdir('/bpfs') ino:4018 name:.
>> readdir('/bpfs') ino:3982 name:..
>> stat('/bpfs/0') fails:2(No such file or directory)
>> stat('/bpfs/-1') fails:2(No such file or directory)
>> stat('/bpfs/self') fails:2(No such file or directory)
>>
>> So I've still got a big problem to solve.
>>
>> john
>>
>>
>> On Tue, Jan 31, 2012 at 4:15 PM, John Hawkes
>> <jhawkes(a)penguincomputing.com> wrote:
>>>
>>> So I think what the slave problem is ... is that nothing does an
>>> actual mount of /bpfs. The /bpfs filesystem gets registered, but
>>> nothing triggers a call to get the superblock or (because of that) to
>>> fill the superblock.
>>>
>>> I see that the /proc filesystem calls kern_mount_data(). Perhaps the
>>> slave side needs to do that. On the master side, the
>>> /etc/init.d/beowulf script issues a mount command.
>>>
>>> john
>>>
>>> I rewrote much of the /bpfs code for RHEL 6.2, by the way, because
>>> there were lots of changes in the vfs layer.
>>>
>>> On Tue, Jan 31, 2012 at 3:53 PM, John Hawkes
>>> <jhawkes(a)penguincomputing.com> wrote:
>>>>
>>>> Apparently I have problems with my port of the /bpfs code, too.
>>>>
>>>> Things seem to be working pretty well for the master, but on the slave
>>>> the behavior is strange. I discovered this while trying to understand
>>>> why /bpfs/self wasn't being seen on the slave. I've got the
>>>> kmod-bproc bpfs code instrumented to be verbose, and I do see the
>>>> various /bpfs/ namespace entries being created, but they aren't being
>>>> seen by user-level code.
>>>>
>>>> So I added a few lines to bpmaster and to bpslave, at the appropriate
>>>> spots after the /bpfs was set up, to do:
>>>> opendir("/rootfs")
>>>> then readdir()
>>>> and these both succeed on master and slave (although I haven't looked
>>>> at what gets returned by readdir). However, on the master, I see a
>>>> call to bpfs_root_readdir(), as expected, but I do *not* see this on
>>>> the slave, even though both master and slave bpfs code use the same
>>>> struct and the same inode and file operations.
>>>>
>>>> As you might expect, after the readdir() I do:
>>>> stat("/bpfs/-1")
>>>> etc. on the master, and
>>>> stat("/bpfs/-1")
>>>> on the slave... the master sees a successful stat() of all the names
>>>> that I expect to be there, but on the slave the stat() calls fail.
>>>> And nothing in the bpfs code seems to be called. So I conclude that
>>>> the /bpfs on the master gets set up as I expect, but on the slave it
>>>> doesn't get set up correctly... even though I'm doing the same
>>>> operations on both for the /bpfs root and the superblock.
>>>>
>>>> Weird.
>>>>
>>>> I hope this isn't part of something subtle with the /rootfs being used
>>>> as the temporary root during the initial phases of the slave boot.
>>>>
>>>> John
>
>
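
Archive note: the readdir()/stat() check described in the quoted messages can be approximated from user space with a few lines of C. The sketch below is not the instrumentation that was added to bpmaster and bpslave. The helper names list_dir() and stat_entry() are hypothetical, and the output format is only modeled on the log lines quoted above. It uses nothing beyond the standard opendir(), readdir(), and stat() calls.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: print every entry that readdir() returns for `dir`.
 * Returns the number of entries seen, or -1 if the directory cannot be opened.
 */
static int list_dir(const char *dir)
{
    DIR *d = opendir(dir);
    if (d == NULL) {
        int err = errno; /* save errno before any further library call */
        printf("opendir('%s') fails:%d(%s)\n", dir, err, strerror(err));
        return -1;
    }

    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(d)) != NULL) {
        printf("readdir('%s') ino:%lu name:'%s'\n",
               dir, (unsigned long)ent->d_ino, ent->d_name);
        count++;
    }
    closedir(d);
    return count;
}

/*
 * Hypothetical helper: stat() the path "dir/name".
 * Returns 0 on success, or the errno value reported by the failure.
 */
static int stat_entry(const char *dir, const char *name)
{
    char path[4096];
    struct stat st;

    int n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return ENAMETOOLONG;

    if (stat(path, &st) != 0) {
        int err = errno;
        printf("stat('%s') fails:%d(%s)\n", path, err, strerror(err));
        return err;
    }
    printf("stat('%s') ok ino:%lu\n", path, (unsigned long)st.st_ino);
    return 0;
}
```

Calling list_dir("/bpfs") followed by stat_entry("/bpfs", "self"), stat_entry("/bpfs", "-1"), and so on from a small main() on both the master and the slave would give output that can be compared side by side. A listing that shows only '.' and '..' together with ENOENT from every stat_entry() call would match the slave symptom reported in the thread.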