Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving the robustness of value retention for the variable second_stage #626

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Jurij-Ivastsuk
Copy link

in function: shim_init(void)
in function: EFI_STATUS set_second_stage (EFI_HANDLE image_handle)
second_stage is set to \grubx64.efi
in the further course of time:
in function: efi_status = init_grub(image_handle);
EFI_STATUS init_grub(EFI_HANDLE image_handle)
second_stage variable - no longer has any value !!! the value \grubx64.efi has been lost
see attached screen foto:
IMG_5120
We have localized this problem:
in the file load-options.c in function parse_load_options(EFI_LOADED_IMAGE *li) the variable loader_str was initialized as CHAR16 *loader_str = NULL;
In the same file line Nr 445 the variable loader_str has received a non-printable ascii character from the function split_load_options(). This means that the variable does not have a NULL value anymore but have got the value "non-printable ascii character". After further check (line 454): if (loader_str) if variable has still NULL value the result was positive (loader_str is not NULL and it has a value) and the variable second_stage (\grubx64.efi) has been overwritten by a non-printable ascii character, as you can see on the screen-foto.
To improve this situation we have introduced the additional check of the first character of the variable loader_str in order to be sure, that at least first character is not a non-printable ascii character:
Line 454 if (loader_str && isprint(loader_str[0])). For this purpose we use standard function isprint(). This function returns true if the character is printable ascii character.
In this way we have solved the problem with non-printable ascii characters and overwriting the originally set value \grubx64.efi for second_stage through the non printable char. I think this improvement will be useful for all users of Shim boot-loader because this situation with non printable characters, as we see, can happen.

@Jurij-Ivastsuk
Copy link
Author

After some research I found a similar solution. It starts at exactly the same point as I did and checks whether the value of this variable represents a valid path before overwriting the variable "second_stage":
https://github.com/rhboot/shim/commit/36adff84a8251f0593e2c524268116c05688170b

Various checks are implemented there, but the check whether the first character contains a non-printable ASCII character is omitted. If you are looking for a general solution to this problem, perhaps a combination of both solutions: mine and the one just mentioned would be a better solution that takes several eventualities into account. But in my opinion, one should make a fundamental consideration whether the function parse_load_options() has any justification at all.

@jprvita
Copy link
Contributor

jprvita commented Feb 3, 2024

After some research I found a similar solution. It starts at exactly the same point as I did and checks whether the value of this variable represents a valid path before overwriting the variable "second_stage": https://github.com/rhboot/shim/commit/36adff84a8251f0593e2c524268116c05688170b

Various checks are implemented there, but the check whether the first character contains a non-printable ASCII character is omitted. If you are looking for a general solution to this problem, perhaps a combination of both solutions: mine and the one just mentioned would be a better solution that takes several eventualities into account. But in my opinion, one should make a fundamental consideration whether the function parse_load_options() has any justification at all.

The link above is broken, it looks like you are linking to a commit that does not exist in this repository. I guess you are referring to some other PR that got updated since, so the commit object got replaced. If that is the case, have you tried to raise this point on such PR? Also, as a general practice, it is better to link to PRs instead of commits, since PR numbers are persistent.

Also, the commits on this PR do not have a commit message and the PR does not pass CI. Adding a good commit message makes it easier to review changes.

@Jurij-Ivastsuk
Copy link
Author

@jprvita Hi, thank you very much for your comment.
At the time I wrote this message, the link was still working... I couldn't find this post either. Funny how posts can just disappear like that. A good tip from you for the future: only share the link on PR.

Ok but the main issue is that the function parse_load_options() does not always return a valid path for the variable second_stage. My improvement is actually that the value after the parse_load_options() function is checked for non-printable ASCII characters. This reduces the probability of an error situation occurring at this point.

@AndrewSav
Copy link

commit

@Jurij-Ivastsuk
Copy link
Author

@AndrewSav Thank you very much! That's exactly the link for Commit I was looking for!

@Metabolix
Copy link

Metabolix commented Feb 15, 2024

Although this solution technically safe (meaning that it won't open any new security holes), I don't think you should use isprint on CHAR16, which is not even ASCII. Did you actually check what the buffer contains and where it comes from? Maybe it's the same BCDEdit problem as #621? The characters in that issue are perfectly printable if the font supports them, so if isprint fixes the issue, it's mostly good luck.

@Jurij-Ivastsuk
Copy link
Author

@Metabolix Hi, thank you very much for your opinion.

regarding isprint, I found the following:
Information from https://www.programiz.com/c-programming/library-function/ctype.h/isprint
Those characters that occupies printing space are known as printable characters.
Even though, isprint() takes integer as an argument, character is passed to the function.
Internally, the character is converted to its ASCII value for the check.
If a character passed to isprint() is a printable character, it returns non-zero integer, if not it returns 0.

It seems that the problem is slightly different as in case from #621. However, what seems to be common is that the function parse_load_options() does not always return the valid value for path. What I found out is the following: the first character after the parse_load_options() function is ASCII character dec 1, so this is a non-printable ASCII character. Non-printable ASCII characters are characters dec from 0 to 32. Of course, we have not solved all problems with the isprint function, but as in our case, at least non-printable characters can be recognized and so the variable second_stage can keep its default value "\grubx64.efi". With this function we only ensure that if such a situation arises (non-printable character) then the default value is not overwritten.
Of course, at this point other combinations of ASCII characters can also occur that we cannot foresee. But at least the current problem we have just experienced is solved.

@Metabolix
Copy link

Metabolix commented Feb 15, 2024

C actually allows locale-dependent charset instead of just ASCII, see https://en.cppreference.com/w/c/string/byte/isprint

But luckily shim seems to define isprint simply as:

#define isprint(c) ((c) >= 0x20 && (c) <= 0x7e)

So it's ok to use.

The side effect is that your file name can now only start with an ASCII character. That's probably not a problem in practice but still a slight limitation, ruling out non-English characters.

On the other hand, you will still have a problem if the first character happens to be good but the rest are garbage. Why not check the whole buffer up to the \0 while you're at it? That might be a nearly perfect solution, if supporting only ASCII characters is sufficient.

Have you determined where the bad data comes from in your case? Is it a bad boot entry or a firmware problem, or what? The function doesn't make it up by itself.

@Jurij-Ivastsuk
Copy link
Author

@Metabolix Hi, thank you for review and your suggestions!
This is definitely not a firmware issue because we are getting the same error on different hardware. This is also not a boot entry problem either, because on different hardware the boot-entrys are different. We have tested the Shim 15.7 and 15.8 (without our patch) on different machines (those with Windows and Linux installations). Our use is that we always start the Shim from the external medium. Maybe that's why this error has only become visible now. Unfortunately, we could not perform a deeper debugging of this case.

@Jurij-Ivastsuk
Copy link
Author

@Metabolix Hi Lauri,
thanks again for your review, as I understand you are pretty close to understanding shim code. Could you please say something about my testing problem? (#639)

@Metabolix
Copy link

@Metabolix Hi Lauri, thanks again for your review, as I understand you are pretty close to understanding shim code. Could you please say something about my testing problem? (#639)

Unfortunately not. This second_stage / parse_load_options is the only part I know, based on thoroughly investigating a similar issue. I have nothing to say about signing or certificates.

@Metabolix
Copy link

Can you please explain how you create the boot entry (what OS, what tool) and what are the exact contents of the boot entry (for example, hexdump the entry from Linux /sys/firmware/efi/efivars/Boot000*). That would make it easier to verify the best solution for the case.

@Jurij-Ivastsuk
Copy link
Author

@Metabolix Hi, many thanks for your ideas!

A boot entry is stored in the firmware's boot manager, which can be accessed during the installation of OS or modified just after the installation. The prerequisite for this is that the operating system remains permanently installed on the hardware.

However, our boot environment is located on the separate boot medium and the Linux environment is started from there. This means that we have no way of permanently binding our boot environment to a specific hardware. Consequently, if our Linux boot environment is repeatedly started on the other hardware, we will always have different boot entry contents that we are unable to customize accordingly. This means that the contents of /sys/firmware/efi/efivars/Boot000* will always be unpredictably different.

Therefore, from my point of view, the solution to the problem for our case is to improve the code of Shim. Specifically: the original code is missing the check of the output of the split_load_options() function. This is where the check should take place to determine whether the string has a valid path or not. The original code only checks whether the output is not NULL. This is too little, especially for the case like ours.

@Metabolix
Copy link

Metabolix commented May 9, 2024

Just to get this straight: do you create an external boot medium, like a live cd, which is booted by selecting "boot from cd" or "boot from usb" or some such? So that you don't have an own boot entry at all? This would mean that any extra data passed to shim is generated by the firmware.

I don't doubt that your patch works for you, but to convince the maintainers to accept a patch upstream, it's probably necessary to have more data than just "works for me".

It would be instructive to get your exact work flow (where is shim and how do you boot it) and a complete hex dump of the whole load_options. Maybe you could create the dump in parse_load_options for debugging purposes.

As I noted earlier, simply limiting the first character to printable ASCII doesn't reliably block all possible bad paths (with bad data at a later point), and it blocks also valid non-ascii paths, so it's not a trivially perfect solution.

@Jurij-Ivastsuk
Copy link
Author

Jurij-Ivastsuk commented May 9, 2024

@Metabolix Hi, thanks for your comments!

... do you create an external boot medium, like a live cd, which is booted by selecting "boot from cd" or "boot from usb" or some such?
So that you don't have an own boot entry at all? This would mean that any extra data passed to shim is generated by the firmware.

That is exactly our case.

Our Linux environment is located on the CD, USB stick or USB drive. The boot process is initiated by selecting Boot Device from the boot menu.
Shim is located in /efi/boot/ of the boot-medium The shim file is renamed to bootx64.efi. This is also where grubx64.efi is located.
The shim starts grubx64.efi. Grub finds the config file where the path and the start parameters for kernel and initrd can be found.

Enclosed, I have put together the necessary hexdumps. All were found as you suggested in /sys/firmware/efi/efivars/Boot000*:

Boot0000-8be4df61-93ca-11d2-aa0d-00e098032b8c:

0000000 0007 0000 000a 0000 0025 0050 0030 003a
0000010 0020 0054 004f 0053 0048 0049 0042 0041
0000020 002d 0054 0052 0032 0030 0030 0020 0020
0000030 0020 0020 0020 0020 0020 0020 0020 0020
0000040 0020 0020 0020 0000 0401 0014 75e7 99e2
0000050 75a0 4b37 e6a2 38c5 6c5e cb00 ff7f 0004
0000060 0105 0009 0011 0000 7f00 04ff 0000
000006d

Boot0002-8be4df61-93ca-11d2-aa0d-00e098032b8c:

0000000 0007 0000 000a 0000 0025 0054 004f 0053
0000010 0048 0049 0042 0041 0000 0401 0014 75e7
0000020 99e2 75a0 4b37 e6a2 38c5 6c5e cb00 ff7f
0000030 0004 0105 0009 0012 0000 7f00 04ff 0000
000003f

Boot0004-8be4df61-93ca-11d2-aa0d-00e098032b8c:

0000000 0007 0000 000a 0000 0025 0050 0031 003a
0000010 0020 0050 004c 0044 0053 0020 0044 0056
0000020 0044 002b 002f 002d 0052 0057 0020 0044
0000030 0048 002d 0031 0036 0041 0045 0053 0020
0000040 0020 0020 0020 0000 0401 0014 75e7 99e2
0000050 75a0 4b37 e6a2 38c5 6c5e cb00 ff7f 0004
0000060 0105 0009 0013 0000 7f00 04ff 0000
000006d

Boot0005-8be4df61-93ca-11d2-aa0d-00e098032b8c:

0000000 0007 0000 0001 0000 0060 0055 0045 0046
0000010 0049 0020 004f 0053 0000 0104 002a 0001
0000020 0000 0800 0000 0000 0000 0000 0010 0000
0000030 0000 14bc d7de d01d 412b 6590 dd42 55ee
0000040 1356 0202 0404 0030 005c 0045 0046 0049
0000050 005c 0042 004f 004f 0054 005c 0042 004f
0000060 004f 0054 0058 0036 0034 002e 0045 0046
0000070 0049 0000 ff7f 0004 0009
000007a

Boot0006-8be4df61-93ca-11d2-aa0d-00e098032b8c:

0000000 0007 0000 0001 0000 0094 0055 0045 0046
0000010 0049 003a 0020 0057 0044 0020 004d 0079
0000020 0020 0050 0061 0073 0073 0070 006f 0072
0000030 0074 0020 0030 0038 0032 0041 0031 0030
0000040 0031 0032 0000 0102 000c 41d0 0a03 0000
0000050 0000 0101 0006 1a00 0503 0006 0001 0503
0000060 0006 0006 0104 002a 0001 0000 0800 0000
0000070 0000 0000 0000 0040 0000 0000 6dfd 7d93
0000080 0000 0000 0000 0000 0000 0000 0101 ff7f
0000090 0004 0401 0044 47ef 2d64 3bc9 41a0 19ac
00000a0 514d 1bd0 e64c 0057 0044 0020 004d 0079
00000b0 0020 0050 0061 0073 0073 0070 006f 0072
00000c0 0074 0020 0030 0038 0032 0041 0031 0030
00000d0 0031 0032 0000 ff7f 0004 0000 4f42
00000de

BootCurrent-8be4df61-93ca-11d2-aa0d-00e098032b8c:

0000000 0006 0000 0006
0000006

BootOptionSupport-8be4df61-93ca-11d2-aa0d-00e098032b8c:

0000000 0006 0000 0001 0000
0000008

BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c:

0000000 0007 0000 0005 0006 0000 0002 0004
000000e

I hope this information is sufficient, if not please let me know.

@Metabolix
Copy link

Metabolix commented May 9, 2024

Thanks for the clarification. Like you said, the NVRAM boot entries are not used in your case. I was hoping for the data which parse_load_options tries to parse. Here's a small patch 0001-debug-dump.txt which you could apply to shim to get the dump visible on screen during boot and also store the data (temporarily) into /sys/firmware/efi/efivars/DebugShimLoadOptions-* so you should be able to access it in Linux. Also please use hexdump -C which makes it more readable wrt. byte order, at least in my opinion. :)

@Jurij-Ivastsuk
Copy link
Author

@Metabolix Hi, after changing the code according to the patch 0001-debug-dump.txt and recompiling, I will send you the content of the file /sys/firmware/efi/efivars/DebugShimLoadOptions-*. This content was extracted with hexdump -C:

00000000 06 00 00 00 00 00 42 4f |......BO|
00000008

Or as file:
debug-dump.txt

@Metabolix
Copy link

Thanks! That's indeed invalid data. One more question, now that I digged around a bit more: Have you defined DISABLE_REMOVABLE_LOAD_OPTIONS, as instructed in BUILDING, which seems to be exactly meant for your use case? "This prevents boot failures because of unexpected data in boot entries automatically generated by firmware."

@Jurij-Ivastsuk
Copy link
Author

@Metabolix Hi, many thanks for your hint! That actually worked! If I set DISABLE_REMOVABLE_LOAD_OPTIONS=y then the error with “Failed to open \EFI\BOOT\ - Invalid Parameter” no longer occurs. This means you can switch off the parsing load options when invoked as boot*.efi. This is OK.

So my problem is actually solved without the patch #626. But purely academically, do you really think that the checking whether the output of the split_load_options() function has a valid path is not necessary? Perhaps there are other constellations in the real world where a similar error can occur?

@Metabolix
Copy link

My use case would be covered by #621. A sanity check for the path would be nice and would fix the same problem, but I don't know what's the preferred way to do it.

  1. Is it ok to disallow non-ASCII characters? It would make things simple and fix many similar cases. The probability of getting a valid string by accident seems pretty low.
  2. Since parse_load_options actually reports invalid data in at least one case, it seems weird to simply ignore invalid data in other cases.
  3. The data which comes after a valid string is apparently passed to the second_stage loader (todo: verify this). If the path contains bad characters, should load_options then contain nothing (as with invalid or missing data), only the data after the bad string (as with an empty string) or all data including the bad string (as if there were an empty string before it)?

@Jurij-Ivastsuk
Copy link
Author

Jurij-Ivastsuk commented May 12, 2024

@Metabolix Hi, So I would approach this problem like this, first of all hypothetically, two phases of sanity-check must be carried out:
Phase 1

  1. we should define in principle which characters can be regarded as valid and which as invalid. I would define the non-printable ASCII characters, for example, as non-valid characters. All other characters can be allowed as a first guess.

Phase 2

  1. we must check whether the path that appears to be valid after the first check is actually valid. At this point we should check whether this path actually leads to an executable *.efi file or not.

Only the path that passes both phases of the check can be considered valid.
If the path is not recognized as valid, it retains the predefined default value as “\EFI\BOOT\bootx64.efi”, otherwise it is given the checked valid value.

@Metabolix
Copy link

  1. Disallowing only 0x00-0x1f is not very useful since it catches only a small portion of probably useless data. So I would rather have a decision about whether non-ASCII is required at all.
  2. There's already another place which reports that the file is not found. That's the message you were complaining about. If your goal is to just remove the message, why go through all the trouble, you could just delete the error messages. I don't think they should be removed. It's good to know if you have tried to setup a filename but the file is missing.

@Jurij-Ivastsuk
Copy link
Author

Jurij-Ivastsuk commented May 12, 2024

@Metabolix I agree with 1, but only partially. Therefore the 2 phases test.
If we could exclude the non-printable ASCII characters, then we would have blocked some of the non-valid path strings.

I don't think deleting the error messages is a good idea. After all, they are there to report certain incorrect constellations. Finally, after parsing the boot options, it is important to know whether you can use the path-to-boot-loader or not. My goal is definitely not to remove the error message, my goal is to have clean code that can handle all eventualities.

Unfortunately, I didn't understand your last question:

It's good to know if you have tried to setup a filename but the file is missing.

I tried to define the default loader (I used the DEFAULT_LOADER option). But this did not lead to a change, and the error “Invalid Parameter” remained. Is that what you mean?

@Metabolix
Copy link

Metabolix commented May 12, 2024

Yes, you will get a similar error message even if the file name (configured in NVRAM boot entry) is valid but the file just does not exist. I don't remember now if it's "invalid parameter" or some other text but the messages are there anyway. If your definition of "valid path" requires that the file exists, then there will be no way to report an error message about a missing file.

@Jurij-Ivastsuk
Copy link
Author

@Metabolix Sorry that I may not have expressed myself clearly. but by “valid-path” I mean “valid-path-to-bootloader”, which means that if we can check that the boot-loader can be found and is accessible, then we can call the whole path, i.e. “path-to-bootloader”, valid. If the message about the boot-loader file is important, then we can check both. First the path alone, and if the path is valid, then the accessibility of the boot loader, i.e. path + boot loader.

@Metabolix
Copy link

Metabolix commented May 12, 2024

This boils down to the question, how do you know the difference between invalid (= should be ignored) and mistake (= should be reported).

Non-printable chars are probably invalid (like in your case).

Wrong arch (like grubaa64.efi instead of grubx64.efi) is probably a mistake and should be reported. Correct arch but missing file is even more likely a mistake.

What about a file name ending in \ or ., do you think it's invalid or a reportable mistake?

What happens with a strange but valid relative path like ..\guid={my-guid}\MyLoader.lol which may be correct but the file and/or dir may be simply missing by mistake?

Where do you draw the line? How complex heuristics do you make? Is there a case where you would consider the data good if the file exists but silently ignore as invalid if it doesn't?

I think that the printable ASCII check is a reasonable (simple but effective) heuristic to detect true invalid data to be ignored, if restricting the path to ASCII is generally acceptable. Personally I wouldn't add any other checks, because I think they would be useless and complex and it's so easy to come up with possible mistakes which then would be silently ignored.

If you suggest some other check, then at least explain why the check is needed (i.e. where would you get such invalid data by accident), and what happens in the examples I made above.

If you don't care about mistakes and just want to ignore all non-bootable paths, that gets us back to the question, why would you implement a new check for it when it's basically the same as just removing the error messages.

@Jurij-Ivastsuk
Copy link
Author

@Metabolix Instead of answering individual questions, I would suggest that I present a scheme for the code structure.

  1. first, the entire path should be isolated as a string value (e.g. \EFI\BOOT\bootx64.efi) It is of course important to know how long this string is. Whether this value is null-terminated (\0)? This is the question that needs to be answered.
  2. we check if the string contains a non-printable ASCII character. If yes - we ignore the whole string and replace it with default string.
  3. we should split the string into two parts: Path (e.g. \EFI\BOOT) and the file name (e.g. bootx64.efi) The question that still needs to be clarified here is whether the relative path remains valid or not, as in your example. If not - then an error message.
  4. we check the accessibility of the path part, if the path is not accessible - then an error message appears; if the path is accessible, we add the file name to the path (e.g. \EFI\BOOT\bootx64.efi) and check the accessibility of the file. If yes - then the whole path ( \EFI\BOOT\bootx64.efi) is to be classified as valid, if no - an error message appears that the file is not accessible.
    File names “\” or “..” are initially classified as valid, but during the accessibility test they will lead to an error message.
    I think I could answer all your questions, if not please let me know.

@Metabolix
Copy link

I'll answer to your numbering:

  1. String detection already happens in split function.
  2. Ascii check is the real change which you are proposing.
  3. Splitting and checking parts again seems unnecessary. If one part of the string is bad then it is not accessible, so the accessibility check is enough.
  4. Accessibility is checked later in load_image and causes the error which you already saw. There is no need to re-implement this check in another place. (Especially the parse function is not the right place for checking accessibility.)

@Jurij-Ivastsuk
Copy link
Author

@Metabolix Regarding point 3, I only included this because you wanted to distinguish between two error situations in your message (= should be ignored) and (= should be reported). If this is not important, then point 3. can be ignored. In this code schema description, I did not want to specify any assignment to specific functions and their localization. It was just an idea of how Sanity-Check should basically look. If you think that the check for non-printable ASCII characters would be sufficient, then, from my point of view, that would be a step in the right direction.

@Jurij-Ivastsuk
Copy link
Author

Jurij-Ivastsuk commented May 13, 2024

@Metabolix Do you think my suggestion for checking non-printable ASCII characters could be accepted and included in the official Shim version?

@Metabolix
Copy link

@Metabolix Do you think my suggestion for checking non-printable ASCII characters could be accepted and included in the official Shim version?

Like I said before, I think it would be more useful to disallow all non-ASCII characters along with the non-nrintable ASCII characters. That is, only allow printable ASCII characters.

A clear benefit is that the default configuration would work out of the box for more use cases (like most of these removable media cases).

However I can't speak for the shim project.

@Metabolix
Copy link

Maybe @vathpela could comment on this idea about ignoring second stage path in load options if it contains non-printable and/or non-ASCII characters. It would solve #370 and #621 and also avoid problems with removable media like in this PR. Who needs those characters in a real-world setting anyway?

@Jurij-Ivastsuk
Copy link
Author

@vathpela Hello, we discussed together with @Metabolix about the sense of ignoring second stage path in load options and replacing it with default path if it contains non-printable or non-ASCII characters. Please see our discussion on this topic. Do you think it makes sense to introduce this check in the official Shim version?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants