Tonight I tried installing FC5 test3 on my fake nforce4 RAID box.
Anaconda and the install completed OK with LVM on top of my dmraid mirror.
However, after the install finishes and the system tries to boot, the kernel/initramfs can't find any LVM PVs, can't activate VolGroup00, and then it panics.
I tried this twice, once with the Xen kernel and once without. Same results, with the behavioral difference that the Xen kernel auto-reboots after it fails to find the LVM PVs.
When I boot into the rescue environment the dmraid is activated and LVM comes up no problem.
In the rescue environment I inspected the init script inside the initrd. It appears fine, but maybe I've missed something. Here it is:
#!/bin/nash
mount -t proc /proc /proc
setquiet
echo Mounting proc filesystem
echo Mounting sysfs filesystem
mount -t sysfs /sys /sys
echo Creating /dev
mount -o mode=0755 -t tmpfs /dev /dev
mkdir /dev/pts
mount -t devpts -o gid=5,mode=620 /dev/pts /dev/pts
mkdir /dev/shm
mkdir /dev/mapper
echo Creating initial device nodes
mknod /dev/ram0 b 1 0
mknod /dev/ram1 b 1 1
mknod /dev/ram b 1 1
mknod /dev/null c 1 3
mknod /dev/zero c 1 5
mknod /dev/systty c 4 0
mknod /dev/tty c 5 0
mknod /dev/console c 5 1
mknod /dev/ptmx c 5 2
mknod /dev/rtc c 10 135
mknod /dev/tty0 c 4 0
mknod /dev/tty1 c 4 1
mknod /dev/tty2 c 4 2
mknod /dev/tty3 c 4 3
mknod /dev/tty4 c 4 4
mknod /dev/tty5 c 4 5
mknod /dev/tty6 c 4 6
mknod /dev/tty7 c 4 7
mknod /dev/tty8 c 4 8
mknod /dev/tty9 c 4 9
mknod /dev/tty10 c 4 10
mknod /dev/tty11 c 4 11
mknod /dev/tty12 c 4 12
mknod /dev/ttyS0 c 4 64
mknod /dev/ttyS1 c 4 65
mknod /dev/ttyS2 c 4 66
mknod /dev/ttyS3 c 4 67
echo Setting up hotplug.
hotplug
echo Creating block device nodes.
mkblkdevs
echo "Loading scsi_mod.ko module"
insmod /lib/scsi_mod.ko
echo "Loading sd_mod.ko module"
insmod /lib/sd_mod.ko
echo "Loading libata.ko module"
insmod /lib/libata.ko
echo "Loading sata_nv.ko module"
insmod /lib/sata_nv.ko
echo "Loading jbd.ko module"
insmod /lib/jbd.ko
echo "Loading ext3.ko module"
insmod /lib/ext3.ko
echo "Loading dm-mod.ko module"
insmod /lib/dm-mod.ko
echo "Loading dm-mirror.ko module"
insmod /lib/dm-mirror.ko
echo "Loading dm-zero.ko module"
insmod /lib/dm-zero.ko
echo "Loading dm-snapshot.ko module"
insmod /lib/dm-snapshot.ko
echo Making device-mapper control node
mkdmnod
mkblkdevs
rmparts sda
rmparts sdb
dm create nvidia_hcddcidd 0 586114702 mirror core 2 64 nosync 2 8:16 0 8:0 0
dm partadd nvidia_hcddcidd
echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
echo Activating logical volumes
lvm vgchange -ay --ignorelockingfailure VolGroup00
resume /dev/VolGroup00/LogVol01
echo Creating root device.
mkrootdev -t ext3 -o defaults,ro /dev/VolGroup00/LogVol00
echo Mounting root filesystem.
mount /sysroot
echo Setting up other filesystems.
setuproot
echo Switching to new root and running init.
switchroot
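(For reference, this is roughly how I pulled the init script out of the initrd from the rescue environment -- just a sketch, the exact initrd filename under /mnt/sysimage/boot depends on the kernel version:)

mkdir /tmp/initrd
cd /tmp/initrd
# FC5 initrds are gzip-compressed cpio archives
zcat /mnt/sysimage/boot/initrd-<kernel-version>.img | cpio -idmv
cat init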
On Mon, 2006-02-20 at 22:31 -0700, Dax Kelson wrote:
mkdmnod
mkblkdevs
rmparts sda
rmparts sdb
dm create nvidia_hcddcidd 0 586114702 mirror core 2 64 nosync 2 8:16 0 8:0 0
dm partadd nvidia_hcddcidd
echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
echo Activating logical volumes
lvm vgchange -ay --ignorelockingfailure VolGroup00
resume /dev/VolGroup00/LogVol01
Ok, so this does all look fine -- can you add some sleeps in here and see if you can copy down exactly what these output, and see which one actually fails?
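Something like this around each of the dm/lvm commands, for example (just a sketch -- the point is to announce each step and pause long enough to read any errors):

echo "about to dm create"
dm create nvidia_hcddcidd 0 586114702 mirror core 2 64 nosync 2 8:16 0 8:0 0
sleep 5
echo "about to dm partadd"
dm partadd nvidia_hcddcidd
sleep 5
echo "about to lvm vgscan"
lvm vgscan --ignorelockingfailure
sleep 5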
If that fails we can build you an initrd by hand that has tools in it...
On Tue, 2006-02-21 at 10:36 -0500, Peter Jones wrote:
On Mon, 2006-02-20 at 22:31 -0700, Dax Kelson wrote:
mkdmnod
mkblkdevs
rmparts sda
rmparts sdb
dm create nvidia_hcddcidd 0 586114702 mirror core 2 64 nosync 2 8:16 0 8:0 0
dm partadd nvidia_hcddcidd
echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
echo Activating logical volumes
lvm vgchange -ay --ignorelockingfailure VolGroup00
resume /dev/VolGroup00/LogVol01
Ok, so this does all look fine -- can you add some sleeps in here and see if you can copy down exactly what these output, and see which one actually fails?
Sure. When I get home tonight I'll do it.
Dax Kelson Guru Labs
On Tue, 2006-02-21 at 10:36 -0500, Peter Jones wrote:
On Mon, 2006-02-20 at 22:31 -0700, Dax Kelson wrote:
mkdmnod
mkblkdevs
rmparts sda
rmparts sdb
dm create nvidia_hcddcidd 0 586114702 mirror core 2 64 nosync 2 8:16 0 8:0 0
dm partadd nvidia_hcddcidd
echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
echo Activating logical volumes
lvm vgchange -ay --ignorelockingfailure VolGroup00
resume /dev/VolGroup00/LogVol01
Ok, so this does all look fine -- can you add some sleeps in here and see if you can copy down exactly what these output, and see which one actually fails?
If that fails we can build you an initrd by hand that has tools in it...
Peter
I added echoes such as "about to dm create" and then some "sleep 5" after each of those commands.
There is zero output from mkdmnod on down until the "lvm vgscan" runs.
It produces this output:
device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@redhat.com
Reading all physical volumes. This may take a while...
No volume groups found
Unable to find volume group "VolGroup00"
...
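If it would help, next time around I can also drop something like this in right before the vgscan, to see whether the PV on the mirror shows up at all (just a guess at a useful next step):

lvm pvscan --ignorelockingfailure
sleep 5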
When I boot into the rescue environment, the dmraid is brought up and LVM is activated automatically and correctly.
Incidentally, in the rescue environment I chrooted into my root filesystem, brought up my network interface (/etc/init.d/network start), and ran yum -y update.
About 40 packages were downloaded, but every rpm install attempt puked out with errors from the preinstall scripts. Unsurprisingly, running rpm -Uvh /path/to/yum/cache/kernel*rpm resulted in the same error. :(
Dax Kelson
On Tue, 2006-02-21 at 22:55 -0700, Dax Kelson wrote:
I added echoes such as "about to dm create" and then some "sleep 5" after each of those commands.
There is zero output from mkdmnod on down until the "lvm vgscan" runs.
Well, that means nothing thinks it's not working. Not an encouraging sign :/
It produces this output:
device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@redhat.com
Reading all physical volumes. This may take a while...
No volume groups found
Unable to find volume group "VolGroup00"
...
When I boot into the rescue environment, the dmraid is brought up and LVM is activated automatically and correctly.
Hrm. If you run "dmsetup table" from this environment, does the output match the "dm create" line in the initrd?
It's almost as if lvm isn't checking the dm volumes, but that shouldn't be the case with even remotely recent lvm2.
Incidentally, in the rescue environment I chrooted into my root filesystem, brought up my network interface (/etc/init.d/network start), and ran yum -y update.
About 40 packages were downloaded, but every rpm install attempt puked out with errors from the preinstall scripts. Unsurprisingly, running rpm -Uvh /path/to/yum/cache/kernel*rpm resulted in the same error. :(
This could be related, but my gut reaction says it's not caused by your raid problems. Obviously it's still bad.
On Wed, 2006-02-22 at 09:18 -0500, Peter Jones wrote:
On Tue, 2006-02-21 at 22:55 -0700, Dax Kelson wrote:
I added echoes such as "about to dm create" and then some "sleep 5" after each of those commands.
There is zero output from mkdmnod on down until the "lvm vgscan" runs.
Well, that means nothing thinks it's not working. Not an encouraging sign :/
It used to work when I installed rawhide last month.
I guess there is no verbose mode?
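Failing that, maybe I can just crank up lvm's own verbosity in the init script, something like (untested):

lvm vgscan -vvvv --ignorelockingfailure
lvm vgchange -ay -vvvv --ignorelockingfailure VolGroup00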
It produces this output:
device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@redhat.com
Reading all physical volumes. This may take a while...
No volume groups found
Unable to find volume group "VolGroup00"
...
When I boot into the rescue environment, the dmraid is brought up and LVM is activated automatically and correctly.
Hrm. If you run "dmsetup table" from this environment, does the output match the "dm create" line in the initrd?
It's almost as if lvm isn't checking the dm volumes, but that shouldn't be the case with even remotely recent lvm2.
It does match. Here is the output from "dmsetup table" inside the rescue environment:
nvidia_hcddciddp1: 0 409368267 linear 253:0 241038
nvidia_hcddcidd: 0 586114702 mirror core 2 64 nosync 2 8:16 0 8:0 0
VolGroup00-LogVol01: 0 4063232 linear 253:3 83952000
VolGroup00-LogVol00: 0 83951616 linear 253:3 384
nvidia_hcddciddp3: 0 176490090 linear 253:0 409609305
nvidia_hcddciddp2: 0 208782 linear 253:0 63
As a reference, here is what is in the initramfs init file:
dm create nvidia_hcddcidd 0 586114702 mirror core 2 64 nosync 2 8:16 0 8:0 0
dm partadd nvidia_hcddcidd
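(For what it's worth, this is roughly how I eyeballed the comparison, using the init file extracted earlier into /tmp/initrd:)

# live table for the mirror, from the rescue environment
dmsetup table nvidia_hcddcidd
# the line baked into the initrd's init
grep 'dm create nvidia_hcddcidd' /tmp/initrd/init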
Incidentally, in the rescue environment I chrooted into my root filesystem, brought up my network interface (/etc/init.d/network start), and ran yum -y update.
About 40 packages were downloaded, but every rpm install attempt puked out with errors from the preinstall scripts. Unsurprisingly, running rpm -Uvh /path/to/yum/cache/kernel*rpm resulted in the same error. :(
This could be related, but my gut reaction says it's not caused by your raid problems. Obviously it's still bad.
Indeed. And it looks like Jeremy Katz just fixed that.
Now if I can get control-c working and ssh/scp able to grab a terminal in the rescue environment, my complaints with it will be gone.
Dax Kelson Guru Labs
On Wed, 2006-02-22 at 09:18 -0500, Peter Jones wrote:
It's almost as if lvm isn't checking the dm volumes, but that shouldn't be the case with even remotely recent lvm2.
For kicks I tried an initrd with the "rmparts" commands commented out.
When I booted that, I did get the expected "duplicate PV found selecting foo" messages. I rebooted before any writes could happen (I think).
Dax Kelson Guru Labs
On Wed, 2006-02-22 at 09:49 -0700, Dax Kelson wrote:
On Wed, 2006-02-22 at 09:18 -0500, Peter Jones wrote:
It's almost as if lvm isn't checking the dm volumes, but that shouldn't be the case with even remotely recent lvm2.
For kicks I tried an initrd with the "rmparts" commands commented out.
When I booted that, I did get the expected "duplicate PV found selecting foo" messages. I rebooted before any writes could happen (I think).
Hrm. OK, that means it got farther. So try booting into the rescue environment and adding those drives into the lvm filters, and I'll try to figure out how to reproduce the failure.
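Something along these lines in /etc/lvm/lvm.conf ought to do it (untested sketch; the device names are just taken from your dmsetup output):

devices {
    # scan the dmraid device, reject the raw member disks
    filter = [ "a|^/dev/mapper/nvidia_|", "r|^/dev/sd[ab]|", "a|.*|" ]
}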
On Wed, 2006-02-22 at 14:42 -0500, Peter Jones wrote:
Hrm. OK, that means it got farther. So try booting into the rescue environment and adding those drives into the lvm filters, and I'll try to figure out how to reproduce the failure.
Actually, I think I see what's going wrong, and this won't help at all.
I'm very close to actually wanting to install FC5 on a brand-spanking-new beast of a machine, but its disk configuration is identical to what the initial reporter has (nvidia raid0). Should I wait a while to see a fix for these problems in the rawhide changelog, or would it be a long wait? :-)
-- Chris
-----Original Message-----
From: Peter Jones
Sent: Wednesday, February 22, 2006 22:39
To: Dax Kelson
Cc: Development discussions related to Fedora Core
Subject: Re: FC5 test3 -- dmraid broken?
On Wed, 2006-02-22 at 14:42 -0500, Peter Jones wrote:
Hrm. OK, that means it got farther. So try booting into the rescue environment and adding those drives into the lvm filters, and I'll try to figure out how to reproduce the failure.
Actually, I think I see what's going wrong, and this won't help at all.
On Sat, 2006-02-25 at 10:34 +0100, Chris Chabot wrote:
I'm very close to actually wanting to install FC5 on a brand-spanking-new beast of a machine, but its disk configuration is identical to what the initial reporter has (nvidia raid0). Should I wait a while to see a fix for these problems in the rawhide changelog, or would it be a long wait? :-)
Tomorrow's rawhide might be worth trying ;)
On Tue, 2006-02-21 at 22:55 -0700, Dax Kelson wrote:
Incidentally, in the rescue environment I chrooted into my root filesystem, brought up my network interface (/etc/init.d/network start), and ran yum -y update.
About 40 packages were downloaded, but every rpm install attempt puked out with errors from the preinstall scripts. Unsurprisingly, running rpm -Uvh /path/to/yum/cache/kernel*rpm resulted in the same error. :(
This is because we haven't been mounting /selinux in the chroot. I thought someone was going to file that, but apparently it didn't happen. I went ahead and fixed it in CVS.
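Until that shows up, a manual workaround from the rescue environment should be something like this (untested sketch) before chrooting and running yum/rpm:

mount -t selinuxfs none /mnt/sysimage/selinux
chroot /mnt/sysimage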
Jeremy