If you are knowledgeable about UEFI, I'll welcome your advice. This is the issue I encountered:
1. I enabled UEFI mode in the BIOS of a Lenovo X220 (more exactly, I set UEFI as the preferred method).
2. I installed Fedora 17.
3. A "Fedora" item appeared in the BIOS "Boot order" and also in the boot manager shown when you hit F12 on device start-up.
4. The Lenovo X220 machine had a broken audio connector, so I received a replacement: exactly the same X220 model (completely the same hardware), just a different unit.
5. I enabled UEFI mode in the BIOS of the new X220 machine.
6. I swapped the disk from the old X220 machine to the new X220 machine.
7. The new X220 machine behaved as if the hard disk was not bootable, exactly as if the disk were blank. When I selected to boot from HDD, it just skipped the HDD and went on to other boot methods (CD, network, etc). Of course there was no longer any "Fedora" item in the BIOS "Boot order" or in the F12 boot manager.
8. I had no idea how to fix that, how to force the new machine to boot my Fedora, or how to "re-install" the UEFI item (e.g. similar to GRUB re-installation). I had to re-install the whole system.
My question obviously is: a) Is this a hardware bug, or are UEFI machines supposed to work this way? Is this the end of disk swapping between machines? b) Is it possible to re-install the UEFI item somehow, e.g. using a LiveCD?
Thanks, Kamil
On 06/28/2012 09:11 AM, Kamil Paral wrote:
If you are knowledgeable about UEFI, I'll welcome your advice. This is the issue I encountered:
- I enabled UEFI mode in BIOS in Lenovo X220 (more exactly I set UEFI as the preferred method).
- I installed Fedora 17.
- "Fedora" item appeared in BIOS in "Boot order" and also in the boot manager if you hit F12 on device start-up.
- The Lenovo X220 machine had a broken audio connector, so I received a replacement, exactly the same X220 machine (completely same hardware), just a different piece.
- I enabled UEFI mode in BIOS in the new X220 machine.
- I swapped the disk from the old X220 machine to the new X220 machine.
- The new X220 machine pretended that the harddisk was not bootable. It behaved exactly same as if the disk was blank. When I selected to boot from HDD, it just skipped HDD and went to other boot methods (CD, network, etc). Of course there was no longer any "Fedora" item in BIOS "Boot order" or the boot manager on F12 key press.
- I had no idea how to fix that, how to force the new machine to boot my Fedora, or how to "re-install" the UEFI item (e.g. similar to GRUB re-installation). I had to re-install the whole system.
My question obviously is: a) Is this a hardware bug, or are UEFI machines supposed to work this way? Is this the end of disk swapping between machines? b) Is it possible to re-install the UEFI item somehow, e.g. using a LiveCD?
It certainly appears that your newer X220 isn't set to boot in UEFI mode?
On 06/28/2012 09:25 AM, Peter Jones wrote:
On 06/28/2012 09:11 AM, Kamil Paral wrote:
If you are knowledgeable about UEFI, I'll welcome your advice. This is the issue I encountered:
- I enabled UEFI mode in BIOS in Lenovo X220 (more exactly I set UEFI as the preferred method).
- I installed Fedora 17.
- "Fedora" item appeared in BIOS in "Boot order" and also in the boot manager if you hit F12 on device start-up.
- The Lenovo X220 machine had a broken audio connector, so I received a replacement, exactly the same X220 machine (completely same hardware), just a different piece.
- I enabled UEFI mode in BIOS in the new X220 machine.
- I swapped the disk from the old X220 machine to the new X220 machine.
- The new X220 machine pretended that the harddisk was not bootable. It behaved exactly same as if the disk was blank. When I selected to boot from HDD, it just skipped HDD and went to other boot methods (CD, network, etc). Of course there was no longer any "Fedora" item in BIOS "Boot order" or the boot manager on F12 key press.
- I had no idea how to fix that, how to force the new machine to boot my Fedora, or how to "re-install" the UEFI item (e.g. similar to GRUB re-installation). I had to re-install the whole system.
My question obviously is: a) Is this a hardware bug, or are UEFI machines supposed to work this way? Is this the end of disk swapping between machines? b) Is it possible to re-install the UEFI item somehow, e.g. using a LiveCD?
It certainly appears that your newer X220 isn't set to boot in UEFI mode?
Having sent that mail it became obvious that what's happened is that your new x220 board doesn't have the efi boot variable set. Some machines allow you to boot from a file, in which case it'll be /efi/fedora/grubx64.efi . If your firmware doesn't have that, you'll need to boot some install/rescue media to get to a shell. In either case you'll need to use efibootmgr to add /efi/fedora/grubx64.efi to the boot order.
That's all assuming it's F17; if it's earlier, it'll be /efi/redhat/grub.efi .
On Thu, 28.06.12 09:29, Peter Jones (pjones@redhat.com) wrote:
Having sent that mail it became obvious that what's happened is that your new x220 board doesn't have the efi boot variable set. Some machines allow you to boot from a file, in which case it'll be /efi/fedora/grubx64.efi . If your firmware doesn't have that, you'll need to boot some install/rescue media to get to a shell. In either case you'll need to use efibootmgr to add /efi/fedora/grubx64.efi to the boot order.
That's all assuming it's F17; if it's earlier, it'll be /efi/redhat/grub.efi .
Hmm, so if grub would also install itself into /efi/boot/bootx64.efi then this problem would just go away as that is the default file that the EFI bios will execute. This would enable disk images that just boot without any need to register them in the bios...
Is there any reason why Fedora doesn't create that file?
(it's a pity FAT can't do symlinks, hence it should just be a copy of grubx64.efi)
Lennart
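A minimal sketch of the copy suggested above, meant to be run from the installed system or a rescue shell. The ESP mount point and the loader path are parameters and are assumptions here, not something Fedora's tooling is known to do; the function name is hypothetical. The sketch refuses to overwrite an existing file, since another OS may already have placed its own loader at that path.

```shell
#!/bin/sh
# Copy an installed loader to the default fallback path on the ESP.
# Hypothetical helper: the mount point and loader path are supplied by the caller.
install_fallback_loader() {
    esp="$1"        # mount point of the EFI System Partition (e.g. /boot/efi, an assumption)
    loader="$2"     # loader path relative to the ESP root, e.g. EFI/fedora/grubx64.efi
    fallback="$esp/EFI/BOOT/BOOTX64.EFI"

    # The source loader must exist before anything is created.
    [ -f "$esp/$loader" ] || { echo "no loader at $esp/$loader" >&2; return 1; }

    # Do not clobber a fallback file that something else installed.
    if [ -e "$fallback" ]; then
        echo "fallback already present: $fallback" >&2
        return 2
    fi

    # FAT has no symlinks, so this is a plain copy.
    mkdir -p "$esp/EFI/BOOT" && cp "$esp/$loader" "$fallback"
}
```

A call might look like `install_fallback_loader /boot/efi EFI/fedora/grubx64.efi`; the copy has to be refreshed by hand whenever the real loader is updated.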
On 06/28/2012 09:40 AM, Lennart Poettering wrote:
On Thu, 28.06.12 09:29, Peter Jones (pjones@redhat.com) wrote:
Having sent that mail it became obvious that what's happened is that your new x220 board doesn't have the efi boot variable set. Some machines allow you to boot from a file, in which case it'll be /efi/fedora/grubx64.efi . If your firmware doesn't have that, you'll need to boot some install/rescue media to get to a shell. In either case you'll need to use efibootmgr to add /efi/fedora/grubx64.efi to the boot order.
That's all assuming it's F17; if it's earlier, it'll be /efi/redhat/grub.efi .
Hmm, so if grub would also install itself into /efi/boot/bootx64.efi then this problem would just go away as that is the default file that the EFI bios will execute. This would enable disk images that just boot without any need to register them in the bios...
Is there any reason why Fedora doesn't create that file?
(it's a pity FAT can't do symlinks, hence it should just be a copy of grubx64.efi)
You're not wrong, we just haven't solved this right yet. Using /efi/boot/bootx64.efi on non-removable media was an addition to the spec in 2.3.1, which came out right /before/ we joined the USWG, and it isn't what we'd really like to be there. Among other problems, obviously if you're dual booting then each OS is just going to clobber the other's file with its own, so whichever you install first doesn't get to play in a failure scenario.
We haven't simply switched to using grub for that, because we don't really want the normal bootloader there as the "boot file of last resort". The idea is to have that file look for your normal bootloader and re-add Boot#### entries automatically if it gets run, and then have it exec your real bootloader. I have the beginning of some code to do this, and it'll probably go in shim. We're also going to propose a best-practices at USWG for more standardized discovery in this situation, so we can do something more standard across OSes without worrying about clobbering this file as we do now.
We could put grubx64.efi there as a stop-gap, and if we don't have what I've mentioned above ready for F18, we probably will.
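To illustrate only the discovery step: the tool described above would be an EFI binary, which this is not. As a stand-in, here is a shell sketch that lists loaders found under the vendor directories of a mounted ESP, skipping the shared EFI/BOOT directory, so that a person can decide which entry to re-add. The function name and mount point are hypothetical, and matching only the lowercase .efi suffix is a simplification.

```shell
#!/bin/sh
# List candidate loaders under EFI/<vendor>/ on a mounted ESP.
# Output uses the backslash form seen in firmware boot entries.
list_vendor_loaders() {
    esp="$1"                               # ESP mount point (assumption)
    for f in "$esp"/EFI/*/*.efi; do
        [ -f "$f" ] || continue            # the glob matched nothing
        rel="${f#"$esp"/}"                 # path relative to the ESP root
        case "$rel" in
            EFI/BOOT/*|EFI/boot/*) continue ;;   # shared fallback directory, not a vendor
        esac
        printf '\\%s\n' "$(printf '%s' "$rel" | tr '/' '\\')"
    done
}
```

Not every .efi file is a boot loader, so the list is a set of candidates to review, not something to boot or register blindly.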
On Thu, Jun 28, 2012 at 03:40:17PM +0200, Lennart Poettering wrote:
Hmm, so if grub would also install itself into /efi/boot/bootx64.efi then this problem would just go away as that is the default file that the EFI bios will execute. This would enable disk images that just boot without any need to register them in the bios...
Is there any reason why Fedora doesn't create that file?
Yes - it's not intended to be a bootloader, it's intended to be something that handles setting up boot options again. We (a) haven't written it, and (b) need to find a way to co-exist with other operating systems that write the same file.
Having sent that mail it became obvious that what's happened is that your new x220 board doesn't have the efi boot variable set. Some machines allow you to boot from a file, in which case it'll be /efi/fedora/grubx64.efi . If your firmware doesn't have that, you'll need to boot some install/rescue media to get to a shell. In either case you'll need to use efibootmgr to add /efi/fedora/grubx64.efi to the boot order.
That's all assuming it's F17; if it's earlier, it'll be /efi/redhat/grub.efi .
efibootmgr revealed the following:
$ efibootmgr -v
...
Boot0019* Fedora HD(1,800,64000,16a05b56-2ea8-4cea-956b-f2d5499583e5)File(\EFI\redhat\grub.efi)
(It's an F17 clean install, but it has a /grub.efi file instead of /grubx64.efi. I installed from USB.)
That means that if I can re-generate the same boot option on the new hardware, it should boot, right? That's great. I can't reproduce it easily again (the other X220 is gone now), but it's useful to know this in case I need it again. Thanks for the explanation.
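For reference, the HD(...) part of that entry can be read as partition number, start sector, size in sectors, and partition GUID, with start and size in hexadecimal. That reading is how this older efibootmgr output format is generally understood, so treat it as an assumption. A small sketch that decodes it (the function name is hypothetical):

```shell
#!/bin/sh
# Decode an HD(part,start,size,guid) node; start and size are taken as hex.
decode_hd_node() {
    node="$1"
    inner="${node#"HD("}"          # drop the leading "HD("
    inner="${inner%%")"*}"         # drop the first ")" and everything after it
    old_ifs="$IFS"; IFS=,
    set -- $inner                  # split on commas: part start size guid
    IFS="$old_ifs"
    printf 'partition=%s start_lba=%d sectors=%d guid=%s\n' \
        "$1" "$((0x$2))" "$((0x$3))" "$4"
}

decode_hd_node 'HD(1,800,64000,16a05b56-2ea8-4cea-956b-f2d5499583e5)'
# prints: partition=1 start_lba=2048 sectors=409600 guid=16a05b56-2ea8-4cea-956b-f2d5499583e5
```

Assuming 512-byte sectors, that is a 200 MiB partition starting 1 MiB into the disk. None of these values depend on the machine, only on the disk.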
Do we have a Fedora page documenting boot problems somewhere (re-installing GRUB and stuff)? It would be useful to add a short help section in there about UEFI too. GRUB guides are all over the Internet, but UEFI is new stuff and I wasn't able to google anything at all about this problem.
On 06/28/2012 10:08 AM, Kamil Paral wrote:
Having sent that mail it became obvious that what's happened is that your new x220 board doesn't have the efi boot variable set. Some machines allow you to boot from a file, in which case it'll be /efi/fedora/grubx64.efi . If your firmware doesn't have that, you'll need to boot some install/rescue media to get to a shell. In either case you'll need to use efibootmgr to add /efi/fedora/grubx64.efi to the boot order.
That's all assuming it's F17; if it's earlier, it'll be /efi/redhat/grub.efi .
efibootmgr revealed the following:
$ efibootmgr -v
...
Boot0019* Fedora HD(1,800,64000,16a05b56-2ea8-4cea-956b-f2d5499583e5)File(\EFI\redhat\grub.efi)
(It's an F17 clean install, but it has a /grub.efi file instead of /grubx64.efi. I installed from USB.)
Er, yes, I've already lost what happened before the F18 tree from my mind :/. Sorry for the confusion.
That means that if I can re-generate the same boot option on the new hardware, it should boot, right? That's great. I can't reproduce it easily again (the other X220 is gone now), but it's useful to know this in case I need it again. Thanks for the explanation.
Well, the HD(...) may be slightly different, but I don't think it will be. If you've got everything mounted and you're in the chroot then you should be able to do:
efibootmgr -c -b ${SOMEFREEBOOTNUM} -L Fedora -l '\EFI\redhat\grub.efi'
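A slightly fuller sketch of the same call, with the disk and partition spelled out instead of relying on defaults. The disk /dev/sda and partition 1 are assumptions for illustration (the partition number matches the first field of the HD(...) node quoted earlier), and the wrapper function is hypothetical. It only prints the command unless DRY_RUN=0 is set, since actually running it needs root and a system booted in UEFI mode.

```shell
#!/bin/sh
# Build an efibootmgr call that recreates a boot entry; print it by default.
recreate_boot_entry() {
    disk="$1"      # whole disk that holds the ESP, e.g. /dev/sda (assumption)
    part="$2"      # ESP partition number on that disk
    label="$3"     # name shown in the firmware boot menu
    loader="$4"    # backslash-separated path relative to the ESP root
    set -- efibootmgr -c -d "$disk" -p "$part" -L "$label" -l "$loader"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"         # display only; shell quoting is not preserved
    else
        "$@"
    fi
}

recreate_boot_entry /dev/sda 1 Fedora '\EFI\redhat\grub.efi'
```

The -b option is left out here; as far as I know efibootmgr then picks a free Boot#### number itself when creating the entry.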
Do we have a Fedora page documenting boot problems somewhere (re-installing GRUB and stuff)? It would be useful to add a short help section in there about UEFI too. GRUB guides are all over the Internet, but UEFI is new stuff and I wasn't able to google anything at all about this problem.
Common Bugs is kinda sorta close to what you're talking about I guess?
On Jun 28, 2012, at 8:08 AM, Kamil Paral wrote:
Do we have a Fedora page documenting boot problems somewhere (re-installing GRUB and stuff)? It would be useful to add a short help section in there about UEFI too. GRUB guides are all over the Internet, but UEFI is new stuff and I wasn't able to google anything at all about this problem.
On Linux, there are, in effect, four GRUBs. GRUB legacy, GRUB legacy EFI, GRUB 2, and GRUB 2 EFI. And all four are fundamentally different in the context of your question. I think any boot troubleshooting guide needs to be specific to each of these use cases, rather than being documented together. Documented together would be an example of user hostile documentation, not just a user hostile boot loading experience, because anyone needing a troubleshooting guide doesn't need to wade through 3/4 of the material that's inapplicable to their case.
Since GRUB legacy EFI is a Red Hat thing, not available upstream, but is what's being used in F17, it's unlikely you're going to find much useful information online about it. I certainly haven't. My (still limited) understanding of its idiosyncrasies comes from pain and misery due to direct contact. GRUB2 EFI has the potential to be an even more painful experience due to idiosyncrasies with various UEFI implementations.
I am speaking from a troubleshooting perspective, i.e., once boot loading has gone off the rails, it is a truly obnoxious experience compared with what I'm used to in the Apple hardware world. There, the fallback just works, doesn't matter what the OS is, and it requires no UI.
Chris Murphy
On Jun 28, 2012, at 7:29 AM, Peter Jones wrote:
Having sent that mail it became obvious that what's happened is that your new x220 board doesn't have the efi boot variable set. Some machines allow you to boot from a file, in which case it'll be /efi/fedora/grubx64.efi . If your firmware doesn't have that, you'll need to boot some install/rescue media to get to a shell.
It is perturbing that in 2012, with a nearly 30MB operating system as a pre-boot environment, the firmware by design doesn't scan the EFI System partition for other possible boot options - like a rescue mode - in the event efi boot variables aren't set.
Apple hardware does just such a scan. If I blow away every bit of information in NVRAM, the firmware still scans available disks, and chooses a reasonable default as fallback. Even in the case when Apple's bootloader isn't present.
So after all of this UEFI complexity and baggage, we still need rescue media in the example case? That is unbelievably stupid. The Lenovo case is either a bug or it's bad design or they enjoy creating user hostile hardware.
Chris Murphy
On 06/28/2012 12:17 PM, Chris Murphy wrote:
It is perturbing that in 2012, with a nearly 30MB operating system as a pre-boot environment, the firmware by design doesn't scan the EFI System partition for other possible boot options - like a rescue mode - in the event efi boot variables aren't set.
Well, as a matter of fact, if you read upthread, as of 2.3.1 it launches /boot/efi/boot${ARCH}.efi in this case. We weren't prepared for it, and so we're a little behind, but we've got a plan and we're going to do something about it.
Apple hardware does just such a scan. If I blow away every bit of information in NVRAM, the firmware still scans available disks, and chooses a reasonable default as fallback. Even in the case when Apple's bootloader isn't present.
I bet you their reasonable default doesn't seem as good if you're normally dual-booting and using grub to chain-load apple's loader. I bet it's 50/50 based on some criteria we haven't tried to figure out.
So after all of this UEFI complexity and baggage, we still need rescue media in the example case? That is unbelievably stupid. The Lenovo case is either a bug or it's bad design or they enjoy creating user hostile hardware.
As lennart, myself, and mjg59 all made perfectly clear - this is our bug; it's possible to do this according to spec (though it could be better), and we're just not doing it yet.
On Jun 28, 2012, at 10:26 AM, Peter Jones wrote:
On 06/28/2012 12:17 PM, Chris Murphy wrote:
It is perturbing that in 2012, with a nearly 30MB operating system as a pre-boot environment, the firmware by design doesn't scan the EFI System partition for other possible boot options - like a rescue mode - in the event efi boot variables aren't set.
Well, as a matter of fact, if you read upthread, as of 2.3.1 it launches /boot/efi/boot${ARCH}.efi in this case. We weren't prepared for it, and so we're a little behind, but we've got a plan and we're going to do something about it.
I'm confused. I'm not familiar with that location. In 12.3.1.3 the location EFI//EFI/BOOT/BOOT[machine type short name].EFI is optional. It is only required, with no other, for removable media. If not required, I don't see how you can be faulted for lack of preparation for something optional.
Apple hardware does just such a scan. If I blow away every bit of information in NVRAM, the firmware still scans available disks, and chooses a reasonable default as fallback. Even in the case when Apple's bootloader isn't present.
I bet you their reasonable default doesn't seem as good if you're normally dual-booting and using grub to chain-load apple's loader. I bet it's 50/50 based on some criteria we haven't tried to figure out.
In all of my testing, an empty NVRAM will always locate Apple's bootloader and use it first. If not present, then it goes to EFI//EFI/BOOT/BOOTx64.EFI. If not present, then it executes the first 440 bytes of the MBR (if a partition other than MBR 1 is marked bootable). Lacking a UI entirely, I think these are rather good fallbacks considering the target market. [1]
Now, it may very well be that absent all of those, yet with a efi//efi/redhat/grub.efi present, that Apple hardware would not locate this and use it, even if it were the only obvious choice. I haven't tested it. If that doesn't work, I'd probably criticize it.
So after all of this UEFI complexity and baggage, we still need rescue media in the example case? That is unbelievably stupid. The Lenovo case is either a bug or it's bad design or they enjoy creating user hostile hardware.
As lennart, myself, and mjg59 all made perfectly clear - this is our bug; it's possible to do this according to spec (though it could be better), and we're just not doing it yet.
I think we're talking about different things.
I'm going by section 3.3 of the 2.3.1 spec, which rather clearly says a default boot behavior is required, though undefined (vendor specific), and must be invoked anytime the BootOrder variable is not present or invalid. The point is the expectation that default boot will load an operating system or a maintenance utility.
This is a firmware requirement, not a boot loader or operating system requirement.
I don't know what UEFI version Lenovo purports to conform to, but the lack of an EFI//EFI/BOOT/BOOTx64.EFI image isn't an excuse for it failing to boot a previously bootable disk that is in no way malformed. This seems to be a case of bad firmware behavior that isn't conforming to section 3.3 of the spec.
Chris Murphy
On Jun 28, 2012, at 12:04 PM, Chris Murphy wrote:
Lacking a UI entirely, I think these are rather good fallbacks considering the target market. [1]
[1] The possible exception is the UI-less, optionless activation of CSM-BIOS booting, with no way to prevent it, in the case there's MBR boot code and a bootable MBR partition set. As far as I'm aware, non-Apple hardware makes this a discrete user selection, rather than automatically determined by firmware. Seems like a possible attack vector.
Chris Murphy
On 06/28/2012 02:04 PM, Chris Murphy wrote:
On Jun 28, 2012, at 10:26 AM, Peter Jones wrote:
On 06/28/2012 12:17 PM, Chris Murphy wrote:
It is perturbing that in 2012, with a nearly 30MB operating system as a pre-boot environment, the firmware by design doesn't scan the EFI System partition for other possible boot options - like a rescue mode - in the event efi boot variables aren't set.
Well, as a matter of fact, if you read upthread, as of 2.3.1 it launches /boot/efi/boot${ARCH}.efi in this case. We weren't prepared for it, and so we're a little behind, but we've got a plan and we're going to do something about it.
I'm confused. I'm not familiar with that location. In 12.3.1.3 the location EFI//EFI/BOOT/BOOT[machine type short name].EFI is optional. It is only required, with no other, for removable media. If not required, I don't see how you can be faulted for lack of preparation for something optional.
We're talking about 3.4.1.2 here.
In all of my testing, an empty NVRAM will always locate Apple's bootloader and use it first. If not present, then it goes to EFI//EFI/BOOT/BOOTx64.EFI. If not present, then it executes the first 440 bytes of the MBR (if a partition other than MBR 1 is marked bootable). Lacking a UI entirely, I think these are rather good fallbacks considering the target market. [1]
So what you're saying is that it really doesn't work that well unless you're booting MacOS first. Not surprising.
So after all of this UEFI complexity and baggage, we still need rescue media in the example case? That is unbelievably stupid. The Lenovo case is either a bug or it's bad design or they enjoy creating user hostile hardware.
As lennart, myself, and mjg59 all made perfectly clear - this is our bug; it's possible to do this according to spec (though it could be better), and we're just not doing it yet.
I think we're talking about different things.
I'm going by section 3.3 of the 2.3.1 spec, which rather clearly says a default boot behavior is required, though undefined (vendor specific), and must be invoked anytime the BootOrder variable is not present or invalid. The point is the expectation that default boot will load an operating system or a maintenance utility.
You've completely ignored all of section 3.4, which specifies what those default boot parameters are for various kinds of devices. In version 2.2, there's no default for non-removable media whatsoever. The spec does sort of accidentally define that the vendor must have some default, but it really is allowed to be "set the machine on fire". That's what we're talking about.
In 2.3, section 3.4 has subclause 3.4.1.2 regarding default boot policy for non-removable media. In effect, the policy is that you should put a binary such as I described earlier in the thread in /EFI/BOOT/BOOT${ARCH}.EFI on non-removable media as well.
This is a firmware requirement, not a boot loader or operating system requirement.
It's a tool the OS can use. So far, we have not done so.
I don't know what UEFI version Lenovo purports to conform to, but the lack of an EFI//EFI/BOOT/BOOTx64.EFI image isn't an excuse for it failing to boot a previously bootable disk that is in no way malformed. This seems to be a case of bad firmware behavior that isn't conforming to section 3.3 of the spec.
I see no reason it isn't conforming to the current version of the spec. In fact, I don't see any reason it isn't conforming to /any/ version of the spec. The default behavior prior to 2.3 was to iterate all removable devices and do what's specified there, and then if that fails, iterate all "fixed media" devices and do something completely unspecified.
If we don't put a file there, the firmware is /in no way/ required to do anything in particular. It's *never* required to default to running UEFI applications not specified by Boot#### variables that are included in BootOrder and also do not match the path /EFI/BOOT/BOOT${ARCH}.EFI.
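For concreteness, the default file name depends on the machine type: the spec's removable-media path is \EFI\BOOT\BOOT{machine type short-name}.EFI. A small sketch mapping `uname -m` style names to that path for the machine types defined around the 2.3.1 spec; the function name is hypothetical and the mapping from Linux machine names is my assumption, not something the spec states.

```shell
#!/bin/sh
# Map a `uname -m` style machine name to the UEFI default boot file path.
default_boot_path() {
    case "$1" in
        x86_64) name=BOOTX64.EFI ;;
        i?86)   name=BOOTIA32.EFI ;;
        ia64)   name=BOOTIA64.EFI ;;
        arm*)   name=BOOTARM.EFI ;;
        *) echo "unknown machine type: $1" >&2; return 1 ;;
    esac
    # FAT is case-insensitive, so the case used here is cosmetic on a real ESP.
    printf '/EFI/BOOT/%s\n' "$name"
}
```

Calling `default_boot_path "$(uname -m)"` on the X220 discussed here would give the x64 name, which is the file the firmware may fall back to when no usable boot variables exist.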
On Jun 28, 2012, at 12:29 PM, Peter Jones wrote:
In all of my testing, an empty NVRAM will always locate Apple's bootloader and use it first. If not present, then it goes to EFI//EFI/BOOT/BOOTx64.EFI. If not present, then it executes the first 440 bytes of the MBR (if a partition other than MBR 1 is marked bootable). Lacking a UI entirely, I think these are rather good fallbacks considering the target market. [1]
So what you're saying is that it really doesn't work that well unless you're booting MacOS first. Not surprising.
I've said this how?
It is completely reasonable and rational for Apple hardware to first boot Mac OS, if present, if NVRAM is empty or invalid. It's also consistent with section 3.3 of the UEFI spec. Vendors get to decide the boot order. The point of that section is to get a bootable computer, rather than a computer that craps its diaper.
In 2.3, section 3.4 has subclause 3.4.1.2 regarding default boot policy for non-removable media. In effect, the policy is that you should put a binary such as I described earlier in the thread in /EFI/BOOT/BOOT${ARCH}.EFI on non-removable media as well.
1. 3.4.1.2 is a bit messy. It says in paragraph 2 that default boot processing behavior may optionally occur. Then proceeds to propose how it will occur, if it optionally occurs, using a file to be located in an optional directory per 12.3.1.3.
2. It doesn't at all indicate who should do this. If anything 12.3.1.3 implies it's vendor domain. Not operating system domain.
Given there's no mandate that this subdirectory or file be created, let alone used by the firmware, I don't see how this is your bug, as you put it.
I see no reason it isn't conforming to the current version of the spec. In fact, I don't see any reason it isn't conforming to /any/ version of the spec. The default behavior prior to 2.3 was to iterate all removable devices and do what's specified there, and then if that fails, iterate all "fixed media" devices and do something completely unspecified.
The intent of 3.3 plus 12.3.1.3 is rather clear that the idea is to result in the booting of an operating system or maintenance utility. The previously bootable disk is not malformed, the computer simply lacks the proper efi boot variable in NVRAM, a completely understandable condition, if not common. And yet this firmware shits its pants.
And in 20 years such a thing would never have occurred on Apple hardware in the same context. Those machines have had a startup keyboard command specifically designed to obliterate the contents of NVRAM, and firmware that knows how to recover reasonably intelligently from that condition.
If we don't put a file there, the firmware is /in no way/ required to do anything in particular. It's *never* required to default to running UEFI applications not specified by Boot#### variables that are included in BootOrder and also do not match the path /EFI/BOOT/BOOT${ARCH}.EFI.
What is your interpretation of the first four sentences of paragraph 2 of 3.3? To me that means the firmware is required to create a new boot order, not save to NVRAM, and attempt to boot from each option. Obviously the only required directory in EFI//EFI is the operating system vendor's subdirectory containing their EFI boot image, and the intent of this section is for that to be used.
It's wholly irrational for a user to move a disk from one computer to another and to get either puke in the face (the OP's experience) or even a vendor provided maintenance utility, rather than booting the singular obvious option on the non-removable disk, in this case the only frigging option that could possibly boot the hardware. That it's the same model makes the experience beyond absurd.
Chris Murphy
On Thu, Jun 28, 2012 at 01:54:13PM -0600, Chris Murphy wrote:
It's wholly irrational for a user to move a disk from one computer to another and to get either puke in the face (the OP's experience) or even a vendor provided maintenance utility, rather than booting the singular obvious option on the non-removable disk, in this case the only frigging option that could possibly boot the hardware. That it's the same model makes the experience beyond absurd.
The only obvious thing for it to boot is EFI/BOOT/BOOT${ARCH}.efi. Booting the first EFI executable you find on a drive is not a sensible thing to do. Even Apple don't do that. Install Linux (only) on a Mac, zap the PRAM, see what happens - it'll boot if there's a blessed bootloader on an HFS+ partition, not otherwise.
On Jun 28, 2012, at 1:59 PM, Matthew Garrett wrote:
The only obvious thing for it to boot is EFI/BOOT/BOOT${ARCH}.efi.
An optional file in an optional vendor subdirectory is the obvious choice? Maybe a future spec could be more clear that the subdirectory and an EFI image in it are required, who should provide it, and that it should be used first in the case of invalid or missing BootOrder variables in NVRAM.
This is still in between ambiguous and optional in 2.3.1.
Booting the first EFI executable you find on a drive is not a sensible thing to do.
Puking in the face of the user with an incoherent boot failure message is more sensible than trying the singular boot loader on the available non-removable drive?
I admit this strategy can also cause problems, and the UEFI spec isn't particularly helpful[1] in resolving the problem of removed operating systems, with residual boot loaders that point to them. But that is no worse, and still likely to generate a more coherent boot loader produced "can't find blah" message, than the OP's experienced rat race of an error message.
Even Apple don't do that. Install Linux (only) on a Mac, zap the PRAM, see what happens - it'll boot if there's a blessed bootloader on an HFS+ partition, not otherwise.
They have a vendor defined order, which 3.3 allows, even though Apple EFI is not UEFI. When PRAM is zapped, the NVRAM is empty and nothing is blessed, therefore the sequence I described earlier applies. That it may fail on a singular valid boot loader in EFI//EFI/redhat/grub.efi I'll take your word on, I haven't tried it and if so it's pathetic but also really unsurprising.
And notwithstanding their non-standard EFI and the ensuing problems and incompatibility it has caused, the hardware does provide a vastly superior UX in the same situation as the OP. Apple hardware absolutely would have booted. Unquestionably. And this is not a boot loader feature, or an OS feature, it is a firmware behavior.
[1] Failure of the spec to use "must release" instead of "can release". UEFI v2.3.1, section 2.1.3: If the OS loader experiences a problem and cannot load its operating system correctly, it can release all allocated resources and return control back to the firmware via the Boot Service Exit() call.
Chris Murphy
On 06/28/2012 05:03 PM, Chris Murphy wrote:
On Jun 28, 2012, at 1:59 PM, Matthew Garrett wrote:
The only obvious thing for it to boot is EFI/BOOT/BOOT${ARCH}.efi.
An optional file in an optional vendor subdirectory is the obvious choice? Maybe a future spec could be more clear that the subdirectory and an EFI image in it are required, who should provide it, and that it should be used first in the case of invalid or missing BootOrder variables in NVRAM.
This is still in between ambiguous and optional in 2.3.1.
Booting the first EFI executable you find on a drive is not a sensible thing to do.
Puking in the face of the user with an incoherent boot failure message is more sensible than trying the singular boot loader on the available non-removable drive?
There's no way to know if a UEFI application is a boot loader. You're as likely to accidentally run a firmware raid setup utility or the debug programs we put there with gnu-efi.
I admit this strategy can also cause problems, and the UEFI spec isn't particularly helpful[1] in resolving the problem of removed operating systems, with residual boot loaders that point to them. But that is no worse, and still likely to generate a more coherent boot loader produced "can't find blah" message, than the OP's experienced rat race of an error message.
The UEFI spec is in fact quite helpful, we just haven't done the thing it says to do yet.
On Jun 28, 2012, at 3:13 PM, Peter Jones wrote:
There's no way to know if a UEFI application is a boot loader. You're as likely to accidentally run a firmware raid setup utility or the debug programs we put there with gnu-efi.
Well that seems rather limiting, and problematic.
I admit this strategy can also cause problems, and the UEFI spec isn't particularly helpful[1] in resolving the problem of removed operating systems, with residual boot loaders that point to them. But that is no worse, and still likely to generate a more coherent boot loader produced "can't find blah" message, than the OP's experienced rat race of an error message.
The UEFI spec is in fact quite helpful, we just haven't done the thing it says to do yet.
The optional thing it says you may do, without saying what that is or how to do it, and doesn't require it, doesn't require the subdirectory you want to use, doesn't require it be honored, nor requires the OS vendor to do any of this.
Quite helpful.
This is actually wrong as well. Blessing is a property of the filesystem on modern macs.
It's more correct to say blessing is a property in NVRAM and the filesystem if it is HFS+. The primary mechanism is NVRAM, the fallback is in the HFS+ volume header. It used to be only a property of HFS long ago when NVRAM was tiny.
Chris Murphy
On 06/28/2012 05:03 PM, Chris Murphy wrote:
They have a vendor defined order, which 3.3 allows, even though Apple EFI is not UEFI. When PRAM is zapped, the NVRAM is empty and nothing is blessed, therefore the sequence I described earlier applies.
This is actually wrong as well. Blessing is a property of the filesystem on modern macs.
On Thu, Jun 28, 2012 at 03:03:55PM -0600, Chris Murphy wrote:
On Jun 28, 2012, at 1:59 PM, Matthew Garrett wrote:
The only obvious thing for it to boot is EFI/BOOT/BOOT${ARCH}.efi.
An optional file in an optional vendor subdirectory is the obvious choice? Maybe a future spec could be more clear that the subdirectory and an EFI image in it are required, who should provide it, and that it should be used first in the case of invalid or missing BootOrder variables in NVRAM.
It's not a vendor subdirectory. It belongs to the spec. It's also clearly not required, since you can have an entirely functional system without it.
Booting the first EFI executable you find on a drive is not a sensible thing to do.
Puking in the face of the user with an incoherent boot failure message is more sensible than trying the singular boot loader on the available non-removable drive?
Yes. Of course any useful EFI implementation should then have an interface to choose your bootloader, but that's a somewhat separate issue.
Even Apple don't do that. Install Linux (only) on a Mac, zap the PRAM, see what happens - it'll boot if there's a blessed bootloader on an HFS+ partition, not otherwise.
They have a vendor defined order, which 3.3 allows, even though Apple EFI is not UEFI. When PRAM is zapped, the NVRAM is empty and nothing is blessed, therefore the sequence I described earlier applies. That it may fail on a singular valid boot loader in EFI//EFI/redhat/grub.efi I'll take your word on, I haven't tried it and if so it's pathetic but also really unsurprising.
Apple's firmware will only attempt to load either blessed files or the fallback path. The behaviour is basically identical to the one you're complaining about.
And notwithstanding their non-standard EFI and the ensuing problems and incompatibility it has caused, the hardware does provide a vastly superior UX in the same situation as the OP. Apple hardware absolutely would have booted. Unquestionably. And this is not a boot loader feature, or an OS feature, it is a firmware behavior.
What? No, Apple hardware wouldn't have booted. The only scenario in which it would have is if you had a blessed bootloader, which is clearly massively outside the EFI specification since it relies on HFS+.
On Jun 28, 2012, at 3:25 PM, Matthew Garrett wrote:
An optional file in an optional vendor subdirectory is the obvious choice? Maybe a future spec could be more clear that the subdirectory and an EFI image in it are required, who should provide it, and that it should be used first in the case of invalid or missing BootOrder variables in NVRAM.
It's not a vendor subdirectory. It belongs to the spec. It's also clearly not required, since you can have an entirely functional system without it.
12.3.1.3 says "optional vendor subdirectory called BOOT", although it isn't specific to any vendor.
Puking in the face of the user with an incoherent boot failure message is more sensible than trying the singular boot loader on the available non-removable drive?
Yes. Of course any useful EFI implementation should then have an interface to choose your bootloader, but that's a somewhat separate issue.
Ok well we disagree then because I consider this extremely user hostile behavior. There is only one choice. UI not required because the user isn't needed to choose ONE OPTION. And choosing that one option no matter what it is, statistically it's going to be a boot loader for the only operating system on the drive, which is the 99% use case. The use case in the example that started the thread.
How about we save the firmware puke in the face for when there's meaningful ambiguity involved?
Apple's firmware will only attempt to load either blessed files or the fallback path. The behaviour is basically identical to the one you're complaining about.
The behavior I care about, is results. Swap hard drives, even dual boot, between two Apple computers, and they still boot. Lenovo example in this thread, does not boot in the same case. These are not identical behaviors.
And so far all the Apple hardware I've tried does actually fall back to /EFI/BOOT/BOOTx64.efi unlike a lot of UEFI hardware.
And notwithstanding their non-standard EFI and the ensuing problems and incompatibility it has caused, the hardware does provide a vastly superior UX in the same situation as the OP. Apple hardware absolutely would have booted. Unquestionably. And this is not a boot loader feature, or an OS feature, it is a firmware behavior.
What? No, Apple hardware wouldn't have booted. The only scenario in which it would have is if you had a blessed bootloader, which is clearly massively outside the EFI specification since it relies on HFS+.
Seems like a deficiency of the UEFI specification that a dinky ass company thought of this problem 20 years ago and solved it with file system metadata. There is no reason why the UEFI spec can't do as good or better than this, and make it standard to write out an NVRAM equivalent file, or other metadata, on the EFI System partition, to resolve the ambiguity.
Chris Murphy
On Thu, 2012-06-28 at 17:55 -0600, Chris Murphy wrote:
How about we save the firmware puke in the face for when there's meaningful ambiguity involved?
Who is the 'we' here? Any conceivable 'we' which might be held to exist in the context of the Fedora development list does not, to me, seem to include 'Lenovo firmware engineers'.
Whether you're right or Peter is (my money's on Peter...), this argument seems almost a sideshow: even if you're right and Lenovo's UEFI firmware implementation is a 'bad' one, so what? Manufacturers have been shipping bad firmwares for decades and there are no signs that this is going to stop in the glorious new UEFI era. It has long been established that, in practice, we do our best to work around poor firmware implementations where we can. Even if you win the argument, we'll probably _still_ wind up doing what Peter has proposed in Fedora. Essentially it seems to me that all you're arguing about is whether we call that 'implementing the UEFI spec' or 'working around poor UEFI implementations', which doesn't seem like something it's worth wasting a day's email time arguing about.
On Thu, Jun 28, 2012 at 05:55:09PM -0600, Chris Murphy wrote:
The behavior I care about, is results. Swap hard drives, even dual boot, between two Apple computers, and they still boot. Lenovo example in this thread, does not boot in the same case. These are not identical behaviors.
Yes, because HFS+ lets you put a pointer to a bootloader in the superblock and FAT doesn't. If you don't have a suggestion for how to make this work better with FAT then I don't think this thread is useful. Serialising nvram contents isn't an especially good suggestion.
On Jun 28, 2012, at 8:52 PM, Matthew Garrett wrote:
On Thu, Jun 28, 2012 at 05:55:09PM -0600, Chris Murphy wrote:
The behavior I care about, is results. Swap hard drives, even dual boot, between two Apple computers, and they still boot. Lenovo example in this thread, does not boot in the same case. These are not identical behaviors.
Yes, because HFS+ lets you put a pointer to a bootloader in the superblock and FAT doesn't. If you don't have a suggestion for how to make this work better with FAT then I don't think this thread is useful. Serialising nvram contents isn't an especially good suggestion.
You and Peter may be desensitized to shitty computer behavior, and specs. But consider that Kamil's sequence would not have failed to boot on legacy BIOS+MBR hardware either.
I find it surprising that a 2200 page spec, and the efforts of the UEFI Forum result in such spectacular failure, in a common and unremarkable situation. It seems exceptionally regressive.
Curious, how are manufacturers using bulk imaged disks, separate from the computers they will be installed in, and yet the computers still manage to UEFI boot? I can't believe manufacturers would give up bulk imaging capability, or have someone type commands into each machine's NVRAM.
Chris Murphy
On 06/28/2012 03:54 PM, Chris Murphy wrote:
It doesn't at all indicate who should do this. If anything 12.3.1.3 implies it's vendor domain. Not operating system domain.
It's completely obvious that if we want something to happen, we have to do it.
Given there's no mandate that this subdirectory or file be created, let alone used by the firmware, I don't see how this is your bug, as you put it.
It's a tool for us to use. Right now we don't.
I see no reason it isn't conforming to the current version of the spec. In fact, I don't see any reason it isn't conforming to /any/ version of the spec. The default behavior prior to 2.3 was to iterate all removable devices and do what's specified there, and then if that fails, iterate all "fixed media" devices and do something completely unspecified.
The intent of 3.3 plus 12.3.1.3 is rather clear that the idea is to result in the booting of an operating system or maintenance utility. The previously bootable disk is not malformed, the computer simply lacks the proper efi boot variable in NVRAM, a completely understandable condition, if not common. And yet this firmware shits its pants.
/EFI/BOOT/BOOT${arch}.EFI *is* the maintenance utility the spec refers to.
If we don't put a file there, the firmware is /in no way/ required to do anything in particular. It's *never* required to default to running UEFI applications that are not specified by Boot#### variables included in BootOrder and that also do not match the path /EFI/BOOT/BOOT${ARCH}.EFI .
What is your interpretation of the first four sentences of paragraph 2 of 3.3? To me that means the firmware is required to create a new boot order, not save to NVRAM, and attempt to boot from each option. Obviously the only required directory in EFI//EFI is the operating system vendor's subdirectory containing their EFI boot image, and the intent of this section is for that to be used.
No. In fact, the spec specifically states: "These new default boot options are not saved to non volatile storage." That is, it is not allowed to create new BootOrder or Boot#### variables. That's the OS's job.
It's wholly irrational for a user to move a disk from one computer to another and to get either puke in the face (the OP's experience) or even a vendor provided maintenance utility, rather than booting the singular obvious option on the non-removable disk, in this case the only frigging option that could possibly boot the hardware. That it's the same model makes the experience beyond absurd.
It can be obvious to you and still incompatible with the reasonably working model the spec provides.
On Jun 28, 2012, at 2:01 PM, Peter Jones wrote:
The intent of 3.3 plus 12.3.1.3 is rather clear that the idea is to result in the booting of an operating system or maintenance utility. The previously bootable disk is not malformed, the computer simply lacks the proper efi boot variable in NVRAM, a completely understandable condition, if not common. And yet this firmware shits its pants.
/EFI/BOOT/BOOT${arch}.EFI *is* the maintenance utility the spec refers to.
The spec says operating system or maintenance utility.
And you're still referring to a vendor subdirectory that's optional in the spec. 3.4.1.2 is also optional, but if the option is taken the vendor firmware is to behave in the described manner. You have zero assurance any firmware will conform, except by sheer laziness on the part of firmware vendors who may prefer a single hard-coded default fallback path, rather than having to scan the EFI system partition and come up with a dynamically generated boot list.
What is your interpretation of the first four sentences of paragraph 2 of 3.3? To me that means the firmware is required to create a new boot order, not save to NVRAM, and attempt to boot from each option. Obviously the only required directory in EFI//EFI is the operating system vendor's subdirectory containing their EFI boot image, and the intent of this section is for that to be used.
No. In fact, the spec specifically states: "These new default boot options are not saved to non volatile storage." That is, it is not allowed to create new BootOrder or Boot#### variables. That's the OS's job.
I'm not saying otherwise. I'm saying the spec specifically requires the firmware scan for new default boot options, does not store them in NVRAM, but does try to use them in sequence (vendor defined) to boot the system. BOOTx64.EFI is a fallback position. The behavior in 3.3 is longstanding and was left open ended without a final fail safe, which is the obvious point of bootx64.efi.
There are millions of firmware out there not conforming to 2.3.1 and hence not to 3.4.1.2 anyway.
It's wholly irrational for a user to move a disk from one computer to another and to get either puke in the face (the OP's experience) or even a vendor provided maintenance utility, rather than booting the singular obvious option on the non-removable disk, in this case the only frigging option that could possibly boot the hardware. That it's the same model makes the experience beyond absurd.
It can be obvious to you and still incompatible with the reasonably working model the spec provides.
I'm not bitching about the spec, I'm bitching about the firmware in the context of the OP's described experience. The intent of 3.3 is to avoid failure. It predates 3.4.1.2. The user is experiencing boot failure. I don't see 3.3 being at all in Fedora's domain to solve. It's a firmware problem. Not an OS problem. Not a spec problem.
Chris
On Thu, Jun 28, 2012 at 02:22:48PM -0600, Chris Murphy wrote:
I'm not bitching about the spec, I'm bitching about the firmware in the context of the OP's described experience. The intent of 3.3 is to avoid failure. It predates 3.4.1.2. The user is experiencing boot failure. I don't see 3.3 being at all in Fedora's domain to solve. It's a firmware problem. Not an OS problem. Not a spec problem.
The OS is expected to drop a utility in a well-known location in order to ensure that the firmware can do something sensible with 3.3. We're not doing that. What do you actually want the firmware to do here?
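As a rough illustration of what "dropping a utility in a well-known location" could look like, the sketch below copies an already installed loader to the fallback path on a mounted EFI System Partition. The mount point, the source loader path, and the function name `install_fallback` are assumptions made for illustration. Whether a plain copy of a given loader would then find its own configuration is loader-specific and is not addressed here.

```python
import shutil
from pathlib import Path


def install_fallback(esp_root, loader_rel, fallback_name="BOOTX64.EFI"):
    """Copy an installed loader to <ESP>/EFI/BOOT/<fallback_name>.

    esp_root:   where the EFI System Partition is mounted (assumed).
    loader_rel: loader path relative to the ESP root (assumed).
    Returns the destination path; refuses to overwrite an existing file.
    """
    esp = Path(esp_root)
    src = esp / loader_rel
    if not src.is_file():
        raise FileNotFoundError(f"no loader at {src}")

    dest_dir = esp / "EFI" / "BOOT"
    dest_dir.mkdir(parents=True, exist_ok=True)

    dest = dest_dir / fallback_name
    if dest.exists():
        # Another OS or the hardware vendor may already own this file.
        raise FileExistsError(f"{dest} already exists, not overwriting")

    shutil.copyfile(src, dest)
    return dest
```

A call might look like `install_fallback("/boot/efi", "EFI/redhat/grub.efi")`, where `/boot/efi` is assumed to be the ESP mount point. Refusing to overwrite is a deliberate choice in this sketch, since the fallback file name is shared by everything installed on the disk.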
On Jun 28, 2012, at 2:51 PM, Matthew Garrett wrote:
On Thu, Jun 28, 2012 at 02:22:48PM -0600, Chris Murphy wrote:
I'm not bitching about the spec, I'm bitching about the firmware in the context of the OP's described experience. The intent of 3.3 is to avoid failure. It predates 3.4.1.2. The user is experiencing boot failure. I don't see 3.3 being at all in Fedora's domain to solve. It's a firmware problem. Not an OS problem. Not a spec problem.
The OS is expected to drop a utility in a well-known location in order to ensure that the firmware can do something sensible with 3.3.
I don't see how 3.3 or 3.4 burdens the OS vendor with this utility. 3.3 burdens the firmware and firmware vendor with determining the boot options order, and attempting to boot from each option - with the goal of booting either an operating system or a utility.
And how do you read 3.4.1.2's "default boot processing behavior may optionally occur" because that seems to render everything subsequent as optional for everyone, and still lacks explicit mention that the OS vendor is expected to provide a utility.
Further this seems to present a conflict with the abstraction intent of UEFI between OS and firmware, if the OS is required/expected to produce a utility so the firmware knows WTF to go do with itself.
We're not doing that. What do you actually want the firmware to do here?
Conform to the burden placed on it by 3.3. Scan, produce new vendor defined boot options, then attempt to boot from each option.
Chris Murphy
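The behaviour Chris is asking for here (his reading of 3.3, which Matthew and Peter dispute elsewhere in the thread) could be sketched roughly as below. It is a toy model only: `scan_default_options` and `try_boot` are hypothetical names, the plain sort stands in for "vendor defined" order, and nothing here reflects how any real firmware is implemented.

```python
from pathlib import Path


def scan_default_options(esp_root):
    """Return every *.efi file under <ESP>/EFI as a candidate boot option.

    The list exists in memory only, mirroring the spec text that such
    default boot options are not saved to non-volatile storage.
    """
    efi_dir = Path(esp_root) / "EFI"
    if not efi_dir.is_dir():
        return []
    candidates = [
        p for p in efi_dir.rglob("*")
        if p.is_file() and p.suffix.lower() == ".efi"
    ]
    # "Vendor defined" order is modelled here as a plain sort.
    return sorted(candidates)


def try_boot(candidates, launch):
    """Try each candidate in order; return the first one that launches.

    `launch` is a caller-supplied callable returning True on success.
    """
    for image in candidates:
        if launch(image):
            return image
    return None
```

Peter's objection applies directly to this sketch: nothing in `scan_default_options` can tell a boot loader from a firmware RAID setup utility or a debug program, so `try_boot` may launch something that was never meant to start an operating system.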
On Thu, Jun 28, 2012 at 12:04:41PM -0600, Chris Murphy wrote:
I don't know what UEFI version Lenovo purports to conform to, but the lack of an EFI//EFI/BOOT/BOOTx64.EFI image isn't an excuse for it failing to boot a previously bootable disk that is in no way malformed. This seems to be a case of bad firmware behavior that isn't conforming to section 3.3 of the spec.
Swapping a drive between machines means that you now have a drive UUID that doesn't match any of the boot options. 3.3 says that it should then attempt to boot from the device, and the only spec-defined boot location is EFI/BOOT/BOOT(machine type).efi. It seems to conform to the spec perfectly.
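The lookup Matthew describes can be sketched roughly as follows. This is a toy model in Python, not firmware code: `BootOption`, `Disk`, and `pick_boot_target` are hypothetical names, and matching an entry by partition identifier plus file path is a simplification of how real Boot#### device paths work. The machine-type suffixes are the ones the UEFI spec defines for the default file name.

```python
from dataclasses import dataclass, field
from typing import Optional

# Machine-type suffixes used in the spec-defined fallback file name.
FALLBACK_NAMES = {
    "x64": "BOOTX64.EFI",
    "ia32": "BOOTIA32.EFI",
    "ia64": "BOOTIA64.EFI",
    "arm": "BOOTARM.EFI",
    "aa64": "BOOTAA64.EFI",
}


@dataclass
class BootOption:
    """Toy stand-in for a Boot#### NVRAM variable."""
    label: str
    partition_guid: str
    path: str


@dataclass
class Disk:
    """Toy stand-in for a disk with one EFI System Partition."""
    partition_guid: str
    files: set = field(default_factory=set)

    def has(self, path: str) -> bool:
        # FAT is case-insensitive, so compare upper-cased paths.
        return path.upper() in {f.upper() for f in self.files}


def fallback_path(arch: str) -> str:
    return "/EFI/BOOT/" + FALLBACK_NAMES[arch]


def pick_boot_target(nvram, disk, arch="x64") -> Optional[str]:
    # 1. Try the NVRAM boot options in order; an entry only applies
    #    if it points at this partition and the file exists.
    for opt in nvram:
        if opt.partition_guid == disk.partition_guid and disk.has(opt.path):
            return opt.path
    # 2. No usable entry: fall back to the spec-defined default path.
    default = fallback_path(arch)
    if disk.has(default):
        return default
    # 3. Nothing bootable on this disk; firmware moves to the next device.
    return None


fedora_disk = Disk("guid-of-fedora-esp", {"/EFI/redhat/grub.efi"})
old_nvram = [BootOption("Fedora", "guid-of-fedora-esp", "/EFI/redhat/grub.efi")]

print(pick_boot_target(old_nvram, fedora_disk))  # /EFI/redhat/grub.efi
print(pick_boot_target([], fedora_disk))         # None
```

The second call models the replacement machine: its NVRAM has no entry for the disk and the disk carries no file at the default path, so the model skips the disk, which matches what Kamil observed.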
devel@lists.stg.fedoraproject.org