Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mcuboot: add support for new "copy with revert" algorithm #1902

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

michallenc
Copy link
Contributor

@michallenc michallenc commented Feb 20, 2024

Sumary

This commit introduces new algorithm discussed few weeks ago on dev mailing list. The goal of this algorithm is to speed up update process for devices with large images and a lot of free space on external flash memory. The biggest setback of swapping based algorithms is the amount of erases and writes performed on ota1 and otascratch. This may not be an issue if all partitions are in embedded flash, however having them on external flash prolongs the update by an extensive amount of time.

Presented algorithm solves this issue by removing scratch area and adding tertiary slot ota2. This algorithm does not swap images, but rather only copies the update image to primary slot. Recovery option is kept thanks to third slot as secondary or tertiary always keeps recovery image (and the other one is used for new update).

Overall, this algorithm allows to achieve the speed of overwrite only algorithm while retaining the revert/recovery option. It is especially useful for devices with larger images and a lot of free space on external flash.

Target support is currently added just for NuttX RTOS.

Testing

Tested with NuttX on SAMv7 MCU with 2 MB embedded flash. 1 Gb W25Q NOR was used as external flash. Almost 2 MB image takes about 3 minutes to update with swap algorithm with scratch area while it takes about 20 seconds with this new algorithm (once recovery image is created).

@michallenc michallenc force-pushed the copy-with-revert-new-alg branch 2 times, most recently from d45b11c to 7648989 Compare February 20, 2024 15:37
Copy link
Collaborator

@nordicjm nordicjm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not be adding a third image slot without a very good justification, original comment mentions swap using scratch, so just use swap using move instead?

@michallenc
Copy link
Contributor Author

We should not be adding a third image slot without a very good justification, original comment mentions swap using scratch, so just use swap using move instead?

But swap using move is limited to support sectors of the same sector layout (see https://github.com/mcu-tools/mcuboot/blob/main/boot/bootutil/src/swap_move.c#L296) thus often not usable for the combination of embedded and external flash. Moreover swap using move is still slower than the presented algorithm as you have to copy image to external flash during every update.

@nordicjm
Copy link
Collaborator

We should not be adding a third image slot without a very good justification, original comment mentions swap using scratch, so just use swap using move instead?

But swap using move is limited to support sectors of the same sector layout (see https://github.com/mcu-tools/mcuboot/blob/main/boot/bootutil/src/swap_move.c#L296) thus often not usable for the combination of embedded and external flash. Moreover swap using move is still slower than the presented algorithm as you have to copy image to external flash during every update.

That is something I believe @d3zd3z mentioned about recently and should be fixed by using the largest sector size of the two

@michallenc
Copy link
Contributor Author

michallenc commented Feb 23, 2024

We should not be adding a third image slot without a very good justification, original comment mentions swap using scratch, so just use swap using move instead?

But swap using move is limited to support sectors of the same sector layout (see https://github.com/mcu-tools/mcuboot/blob/main/boot/bootutil/src/swap_move.c#L296) thus often not usable for the combination of embedded and external flash. Moreover swap using move is still slower than the presented algorithm as you have to copy image to external flash during every update.

That is something I believe @d3zd3z mentioned about recently and should be fixed by using the largest sector size of the two

That would be great and would definitely help, but still the algorithm is much slower. I can get to ~20 seconds for 2 MB image now (if recovery is present), but swap using move would take about ~90 seconds (just theory calculation as swap using scratch takes more than 3 minutes). IMHO this algorithm is better for devices with large external flash memory.

@taltenbach
Copy link
Contributor

taltenbach commented Mar 12, 2024

I'm working on a project with MCUboot on an STM32F4 device and I'm very interested by this new algorithm, which I think brings significant benefits compared to the existing strategies, in particular for MCUs having large internal flash memory sectors. For example, the STM32F4 I'm using has 1.5 MiB of flash divided in sectors of 128 KiB. This means:

  • When using swap-move, one sector of internal flash is reserved for the swap-move algorithm and another sector for the image trailer.
  • When using swap-scratch, at least one sector of internal flash memory must be allocated to the scratch area (which must be located in internal flash, assuming the firmware is encrypted).

Therefore, at least 256 KiB (swap-move) or 128 KiB (swap-scratch) of internal flash cannot be used for storing the app, which represents in both cases a significant loss of space (resp. ~17% and ~8%). For projects having large applications like the one I'm working on, this is an issue.

Considering the app's secondary slot is located in external flash memory, using the "copy with revert" algorithm would eliminate the need to allocate one or multiple sectors in internal flash memory for enabling the swap, while still having the possibility to revert after the update. This comes of course at the cost of higher external flash memory usage.

So, in addition to improving the update speed, this new algorithm also helps reducing the internal flash memory overhead, which would definitely be valuable for projects similar to one I've described.

@ppisa
Copy link

ppisa commented Mar 15, 2024

I support the proposed method and wait for it for more of our projects. I see another reason to use this algorithm, swapping by smaller parts requires to expose internal FLASH to more erase cycle and even if they target different areas there can be negative influence on wearing of the FLASH. If you consider some safety microcontrollers (as TMS570) then they have very strict rules for guaranteed parameters and they cannot be used in critical application after 1000 erase cycles to the main program FLASH. If at least some of the cycles are eliminated even at cost of double or even triple and larger external FLASH size use then it is real win situation.

@michallenc
Copy link
Contributor Author

Hi @d3zd3z, @utzig, @nvlsianpu. Could you take a look at this? Thanks.

@utzig utzig removed their request for review March 31, 2024 23:12
@utzig
Copy link
Member

utzig commented Mar 31, 2024

Hi @d3zd3z, @utzig, @nvlsianpu. Could you take a look at this? Thanks.

Sorry for the delay. I've not been actively using MCUboot, or doing any embedded stuff as of late. This PR is quite big and I would rather not approve or merge, and defer the decision for the active maintainers.

Regarding the PR, the main issue is not having any patches for the simulator to test the new method, so it would require someone to manually test, and any change to any of the "shared" code could break the functionality without it being noticed until the day someone tries to use it. I don't remember if it's a prerequisite to have sim support, but I would not merge it otherwise.

I am also curious if you tested with a large scratch partition, with a size similar to the new partition, and if it also requires all the erases that slow the process down.

Also as mentioned above, adding support for swap move with different sector sizes should work here, and it's probably very straight-forward to do, at least for the simple case where there's a fixed size on each partition!

@michallenc
Copy link
Contributor Author

michallenc commented Apr 2, 2024

Regarding the PR, the main issue is not having any patches for the simulator to test the new method, so it would require someone to manually test, and any change to any of the "shared" code could break the functionality without it being noticed until the day someone tries to use it. I don't remember if it's a prerequisite to have sim support, but I would not merge it otherwise.

Hi, thanks for the reply. Yes, method implementation for a simulator is a reasonable requirement. I will take into it, but first I would like to have some confirmation the maintainers will be willing to consider this algorithm so I do not invest more time into simulator implementation without possibility of merging. Or at least some initial review. I got initial "go for it" from David Brown on mailing list few months ago so I hope there will be a consensus to add this to the mainline.

I am also curious if you tested with a large scratch partition, with a size similar to the new partition, and if it also requires all the erases that slow the process down.

I have not tried that large scratch partition, just 128 KB i think (with entire flash partition being about 2 MB). I can try larger scratch area by Friday, I hope, but I do not expect significant speed up as it still has to erase entire partition twice (firstly for primary -> scratch and secondly for scratch -> secondary). The presented algorithm does not erase secondary or tertiary slot at all during boot, it just erases primary slot (which is fast if located in embedded flash). Also wear leveling is lower here.

This commit generalizes boot_write_image_ok() function to take flag
as an input parameter. Function is also rename to boot_write_image_flag()
so the name matches the usage better.

This is useful to future implementation of different algorithms that
might need other flags than BOOT_FLAG_SET to be written to image
trailer.

Signed-off-by: Michal Lenc <michallenc@seznam.cz>
This algorithm uses three flash partitions to copy images to primary slot
without swap mechanism. This way much faster update process can be
achieved but more space on flash has to be allocated. This is basically
trade off between update speed and flash space taken for boot process.

The algorithm always keeps recovery image in either secondary or
tertiary slot and lets the user upload update image to the other one.
Once image is updated and confirmed, update slot is marked as recovery
(the image is already there uploaded by the user) and old recovery is
marked as new update -> user will upload new image there. This means
there are no writes to ota1 and ota2 partitions during boot process
except and therefore there is no speed limitation if usually slower
(compared to embedded flash) external NOR flash is used for these
partitions. The only exception is first update process where bootloader
has to create recovery image.

Overall, this algorithm allows to achieve the speed of overwrite only
algorithm while retaining the revert/recovery option. It is especially
useful for devices with larger images and a lot of free space on
external flash.

Signed-off-by: Michal Lenc <michallenc@seznam.cz>
This commit adds documentation entry for new algorithm.

Signed-off-by: Michal Lenc <michallenc@seznam.cz>
This slot is used in newly introduced copy with revert algorithm. This
commit allows NuttX to support this algorithm.

Signed-off-by: Michal Lenc <michallenc@seznam.cz>
@ppisa
Copy link

ppisa commented May 5, 2024

Please, what is the state for the case? At least some response from MCUboot core developers, if the discussion about alternative is possible and if they hear for preserving Flash from abundant wear and tear and what are requirements to discuss and possibly merge more decent alternative. We have multiple project at university and even my company (PiKRON) where we wait if MCUboot would be acceptable or we have to continue to use our own previous solution with update over special fixed boot block including USB, CAN and RS-485 communication. I have seen MCUboot as interesting alternative which allows to join forces together and keep boot block size minimal. But I consider copying back and fort over scratch area as really bad solution for cases where larger external memory is available and two complete versions of application can be stored there. I though that MCUboot offers this alternative initially which was my reason to consider it. But when I learned that is is not supported and it seems that it could be forced so indefinitely, I am quite reluctant to look how/if to integrate it our projects.

@nordicjm
Copy link
Collaborator

nordicjm commented May 7, 2024

It's a no from me, I assume it's a no from @de-nordic too but will let him comment

@ppisa
Copy link

ppisa commented May 7, 2024

It's a no from me, I assume it's a no from @de-nordic too but will let him comment

Thanks for response.

Please, can you elaborate alternative to keep Flash wear acceptable?

It would be even worse for MCUs like Ti TMS570/Hercules family (we have experience at PiKRON, designed HW of rapid prototyping platform for Porsche based on it) which has only one thousand guaranteed rewrites of internal program Flash. When it is used with MCUboot without external memory it will be dead immediately even that it has relatively large internal Flash. So if you do not consider keeping wear at reasonable level as valid argument for alternative algorithm, then there should be placed bold warning on the MCUboot pages that it is not suitable for such devices and will cause (unnecessarily) breakage even of other targets in time shorter by factor of two or more. There should be provided formula counting that factor from firmware size and scratch area size and again bold warning of limited usability of MCUboot.

@nordicjm
Copy link
Collaborator

nordicjm commented May 7, 2024

If you want to have MCUboot in a "firmware loader" mode, this means you have MCUboot, your main image and a specific firmware update application (which would update the main image) then you can do this, it meets the goal of not doing additional erases because you would erase the main slot and replace it.

If you want to still have 2 slots, then you have upgrade only mode which will replace the main application, this has the downside that your image must be perfect, if you load a bad update then the device is bricked. The alternative is to have direct XIP with revert, which means you load an update for a specific slot and boot it directly, there is no copying of data to the alternative slot, if the image boots successfully and passes any checks, it can mark itself as confirmed and will boot that image continually, if not then upon reboot (or watchdog timeout) the image will be reverted to the other.

I do not accept this because you are adding a third slot, this just breaks everything we have in zephyr, likely other OS's too, whilst it might work for your usecase it is not needed for the vast majority of everyone else's usecases and vastly complicates things (and as said, completely breaks zephyr support)

@michallenc
Copy link
Contributor Author

I do not accept this because you are adding a third slot, this just breaks everything we have in zephyr, likely other OS's too, whilst it might work for your usecase it is not needed for the vast majority of everyone else's usecases and vastly complicates things (and as said, completely breaks zephyr support)

It should not brake anything. Other algorithms still work as previously (tests on simulator are passing and I have done tests for swap with scratch manually on real hardware before submitting this PR). Type of used algorithm is selected by #ifdef, similarly to how you select between scratch algorithms, overwrite only etc. and definitions for third slot are required only if this algorithm is enabled. So in theory, this new code should not be even compiled for Zephyr and other systems unless correct defines are configured there. There might be some mistakes I missed so if you tried to compile my branch against Zephyr and found some issues I would appreciate if you send them.

In your initial comment, you also mentioned:

We should not be adding a third image slot without a very good justification

I personally think both me and other people (I work on some stuff with @ppisa, but @taltenbach is completely independent, so there is obviously wider usacase than just for me) provided good justification (faster boot, much lower wear leveling) for adding this algorithm.

@ppisa
Copy link

ppisa commented May 7, 2024

Thanks for elaborate. But I expect that if the update application needs to be updated (i.e update protocol security problem or some incremental changes, reliability increase) then there is significant trouble how to update it safely. I liked idea of MCUboot that update is done by main previously up to date application and it is guaranteed that if new image does not behave correctly (i.e. does not connects to some external controller by communication protocol which allows to run some checks and then send command to confirm the firmware as valid) then watchdog or next reset ensures return to previous up to date functional code. I really liked that idea and took it as argument for MCUboot over our previous solutions.

As the third slot protocol has been designed by @michallenc as the configurable option which is off by default, I do not see any downside for MCUboot use in the scenarios you describe. I agree that it represnets some maintenance burden but as I understood @michallenc offered to prepare test cases and infrastructure for that and I do not expect that it would cost much in build time and testing on the host system. But yes, there is valid argument that it is additional code and its maintenance cost should be compared to added value now. But I see really substantial added value and it seems that even some independent MCUboot user @taltenbach sees the same value as the CTU univesity, Elektroline and PiKRON companies. We can ask in https://github.com/robertobucher/pysimCoder (https://www.youtube.com/@robots5/videos) community which is looking for alternative to update over JTAG, but there could be approach with fxed updater and single application image functional because these devices are more for the local experiments than for production systems where they are not available for manual correction when running on the other side of the world (i.e. tram yards in Melbourne). May be more MCUboot users would see the value if the they are informed about this discussion.

@nordicjm
Copy link
Collaborator

nordicjm commented May 7, 2024

Have you tested any of this with the smp_svr sample in zephyr? Have you tried erasing the third slot?

@michallenc
Copy link
Contributor Author

michallenc commented May 7, 2024

Have you tested any of this with the smp_svr sample in zephyr? Have you tried erasing the third slot?

The third slot is currently implemented only for NuttX, it does not have any affect on Zephyr or other systems. Number of image slots is set to default 2 if MCUBOOT_COPY_WITH_REVERT option is not used, see:

#ifdef MCUBOOT_COPY_WITH_REVERT
#  define BOOT_NUM_SLOTS                  3
#else
#  define BOOT_NUM_SLOTS                  2
#endif

The only Zephyr related change is boot_write_image_ok rename to boot_write_image_flag, see commit 1acec48, and this should not break anything.

Regarding smp_svr sample in Zephyr. No, I did not tested this on Zephyr as I do not use the system at all. Tests were done on NuttX only (and checked if simulator tests pass). I have now tested once again NuttX build if MCUBOOT_COPY_WITH_REVERT is not set and third partition is erased from NuttX (basically revert of 8110ce5) and it did not cause any errors.

Did you have some troubles compiling this with Zephyr?

@utzig
Copy link
Member

utzig commented May 7, 2024

I am also curious if you tested with a large scratch partition, with a size similar to the new partition, and if it also requires all the erases that slow the process down.

Very straight-forward question which remains unanswered (or did I miss the answer?!)

When it is used with MCUboot without external memory it will be dead immediately even that it has relatively large internal Flash. So if you do not consider keeping wear at reasonable level

Ditto, same subject as above, there is no wear with a large scratch, maybe...

Tests were done on NuttX only

I don't think the maintainers usually use NuttX, and our CI does not have a NuttX job, maybe you'd like to PR one?

@michallenc
Copy link
Contributor Author

I am also curious if you tested with a large scratch partition, with a size similar to the new partition, and if it also requires all the erases that slow the process down.

Very straight-forward question which remains unanswered (or did I miss the answer?!)

As I have already said, there is no reason this would significantly improve anything. The algorithm remains the same, it still has (based on the docs) to erase scratch area, copy secondary to scratch, erase secondary, copy primary to secondary and finally copy scratch to primary. Yes, number of erases is lower with larger scratch, but you have to erase larger area. As a result, number of erased pages remains the same.

I have done the comparison on real hardware now and there is no significant improvement.

When it is used with MCUboot without external memory it will be dead immediately even that it has relatively large internal Flash. So if you do not consider keeping wear at reasonable level

Ditto, same subject as above, there is no wear with a large scratch, maybe...

Yes, wear leveling will be lower, but still higher compared to the proposed algorithm in this MR as there are more erases.

Tests were done on NuttX only

I don't think the maintainers usually use NuttX, and our CI does not have a NuttX job, maybe you'd like to PR one?

I do not mind adding NuttX CI job, it would benefit both NuttX and mcuboot in my opinion. But as I have already said, I would like at least some reassurement this has a chance of being merged. Because so far it seems there is an opposition against adding the possibility of third slot in general, regardless CI and tests (which is of course your right to reject it).

@d3zd3z
Copy link
Member

d3zd3z commented May 9, 2024

The issue with swap-move needing a separate sector for the image trailer is actually a bug, and needs to be investigated and fixed. It should be possible to do swap-move with two sectors in the first slot, and one sector in the upgrade slot. swap-move should not require more space than swap scratch.

@michallenc
Copy link
Contributor Author

Hello all. I have just tried Zephyr for SAMv71 board and my MCUboot changes and both build and subsequent boot (tested with simple hello_world program) works fine and without errors. So the presented changes should not brake current Zephyr (or other OS) support in any way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants