To adopt urboot #750

Open
mcuee opened this issue Mar 13, 2023 · 15 comments

@mcuee

mcuee commented Mar 13, 2023

urboot may be a more suitable bootloader than Optiboot for ATTinyCore. Note that urboot requires the avrdude 7.1 release.
https://github.com/stefanrueger/urboot

FYI, MicroCore has adopted urboot for ATtiny13A.

@mcuee
Author

mcuee commented Mar 13, 2023

Relevant discussion in Optiboot project.

The feature to erase flash from top to bottom has been implemented in urboot.

@SpenceKonde
Owner

Wow, I had not seen this discussion. Yeah, if this works, I would ditch optiboot in an instant. At present virtual boot on optiboot is not reliable: the boards always eventually brick themselves, and this has caused major, major problems for people. I have tried several times and not been successful at adapting optiboot to do the erase correctly. If Urboot will work, and offers a virtualboot that works reliably, that would be huge. I don't think I'd even ship with optiboot if Urboot was an option.

The problem with the classic tinies is their incredibly heterogeneous nature. There's the 841 with 2 hardware serial ports and "4-page erase" ("We reduced the size of the page buffer by a factor of 4, erase is still the same size, but the pages are real small"). The internal oscillator on the 841 is NUTS. First, above 4 V, its output swerves upwards. Secondly, at 5 V, it can be "tuned" all the way up to 16 MHz o_o All of the final four have 4-page erase and at least one USART, and they have PUE bits for each port to control the pullups. Then there's the 167: it doesn't have a USART, it has a LIN module, which is a USART with a really baller baud rate generator.

And then we have the rest of them, which are still the most popular. And they're all weird in some way. The 85's timer1 is this wacky high speed timer, bleeeh. Same for the 861, but the 861 could use one of 3 pins for the AC, meaning 3 obvious RX pins for the tinySoftSerial. And it has two output compare channels on timer0 THAT CAN ONLY GENERATE AN INTERRUPT. And it had arguably the best ADC until the 2-series launched. The 841 also had a fancy ADC.

How does he handle the status flags? Hopefully by clearing and stashing? You can't leave it to the app, the app won't do it, and without that, you can't trigger a reset if you end up in the bootloader without a reset flag being set. The same thing happens when a nonexistent interrupt is called: it jumps to badisr, which jumps to 0, but the app was built with the assumption that peripherals were in their reset state, not the state the app eventually set them to for the particular combination of settings that contains the bug. I'm convinced that this is the cause of the vast majority of the cases where Arduinos become hung until power cycle or manual reset, which is generally significantly worse than an unplanned but clean reset, particularly when devices are located in inconvenient places. (It's also not nearly as bad as optiboot misfiring when your devices are installed outdoors in Canada in the winter, and having the boards slowly brick themselves. Oh, and the only way you know where they are is the GPS coordinates they can no longer transmit because optiboot bricked them.) That actually happened (they had hundreds of the fuckin things), which is why I so badly want to either fix optiboot or get something that does virtualboot right.
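The clear-and-stash scheme asked about here can be modelled in a few lines of Python. This is only a sketch of the idea, not how optiboot or urboot implement it: `Mcu`, `stash` and `bootloader_entry` are made-up names, and on real hardware the forced reset would typically come from the watchdog.

```python
from dataclasses import dataclass


@dataclass
class Mcu:
    mcusr: int = 0   # reset-cause flags, set by hardware on a genuine reset
    stash: int = 0   # hypothetical place where the bootloader keeps a copy


def bootloader_entry(mcu: Mcu) -> str:
    """Decide what the bootloader does when execution arrives at its start."""
    if mcu.mcusr == 0:
        # No reset flag set: we got here by a wild jump or a bad ISR, so the
        # peripherals are in an unknown state and a clean reset is needed.
        return "force_reset"
    mcu.stash = mcu.mcusr   # keep the reset cause for the application
    mcu.mcusr = 0           # clear, so the next dirty entry is detectable
    return "run"
```

The check only works if the flags are cleared on every genuine reset, which is why the clearing cannot be left to the application.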

@mcuee
Author

mcuee commented Mar 17, 2023

@stefanrueger
You are the best person to answer the queries from @SpenceKonde. Thanks.

@MCUdude
You can chime in to share as well. Thanks.

@stefanrueger

stefanrueger commented Mar 17, 2023

Well, I don't give any warranties, but I think urboot vector bootloaders (my name for virtual bootloaders) are programmed carefully. They protect themselves from being overwritten by an eager uploader or an application utilising the bootloader's exported writePage(sram, flash) routine. There is a compile time option to protect the reset vector, ie, whenever the bootloader programs page 0 it puts an r/jmp to itself in there. Even when compiling without that option set, urboot.c automatically switches that option on if there is enough space left to do so without crossing a page boundary.
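As a rough illustration of the reset vector protection described above, here is a Python sketch that overwrites the first instruction of page 0 with a jump to the bootloader. It is not taken from urboot.c: the function name and parameters are made up, and it ignores that the application's own start address has to be preserved somewhere else.

```python
def protect_reset_vector(page0: bytes, boot_start: int, flash_size: int) -> bytes:
    """Return page 0 with its first instruction replaced by a jump to boot_start.

    boot_start and flash_size are in bytes; opcodes are stored little-endian.
    """
    target = boot_start // 2          # AVR jumps use word addresses
    flash_words = flash_size // 2
    if flash_words <= 4096:
        # rjmp k: 1100 kkkk kkkk kkkk, new PC = PC + 1 + k, with PC = 0 here.
        # With at most 4096 words the 12-bit offset wraps round the flash.
        opcode = 0xC000 | ((target - 1) & 0x0FFF)
        jump = opcode.to_bytes(2, "little")
    else:
        # jmp k (two words): 0x940C, then the word address (needs k < 65536)
        jump = (0x940C).to_bytes(2, "little") + target.to_bytes(2, "little")
    return jump + page0[len(jump):]
```

For an 8 KiB part with a bootloader at byte address 0x1E80 this yields the bytes `3F CF`, i.e. the opcode 0xCF3F.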

Yes. urboot knows about parts with 4-page writes and treats them correctly. Tested with the t1634. I don't know as much about AVR parts as you do, and 4-page-erase isn't mentioned in the ATDF files, so the best I could do to smoke out which parts suffer from that was

cd dir-with-some-data-sheets
for fn in $(find . -type f -iname \*.pdf); do
  pdftotext "$fn" - | grep -iq -e4.page.erase && echo "$fn"
done

This gave me t841, t441 and the t1634. AVRDUDE also needs to know about them, and the most recent one does:

$ avrdude -p*/st | grep n_page_erase
.pt	ATtiny441	n_page_erase	4
.pt	ATtiny841	n_page_erase	4
.pt	ATtiny1634	n_page_erase	4

The reason AVRDUDE needs to consider n_page_erase is that it can program input files with "holes" and skips pages that consist entirely of holes in the input file. On these parts an erase affects n_page_erase pages at once, so AVRDUDE must know the correct effective page size for SPM programming when serving a bootloader for them.
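A small Python sketch of what "effective page size" means for an input file with holes. This is not AVRDUDE's code; `pages_to_program` is a made-up helper, and padding holes inside a touched block with 0xFF is an assumption of the sketch.

```python
def pages_to_program(image: dict, page_size: int, n_page_erase: int = 1) -> dict:
    """Map block start address -> padded data for every block that holds input.

    image maps byte addresses to byte values; missing addresses are holes.
    """
    block = page_size * n_page_erase   # effective page size for SPM programming
    starts = sorted({addr - addr % block for addr in image})
    return {
        start: bytes(image.get(start + i, 0xFF) for i in range(block))
        for start in starts
    }
```

With an illustrative 16-byte page and `n_page_erase=4`, an input byte at address 200 causes the whole 64-byte block starting at 192 to be written, while blocks that contain only holes are skipped.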

> baller baud rate generator for LINUART

Dealt with. You will be pleased to know that the urboot.c source manages to select at compile-time the "best" LINLBT value between 8 and 63.
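To give a feel for what choosing the "best" LINLBT involves, here is a brute-force Python sketch. It assumes the LIN/UART baud rate is f_cpu / (LBT * (LDIV + 1)) with a 12-bit LDIV, which is my reading of the ATtiny167 baud rate generator; urboot.c does its selection at compile time and may well differ in detail.

```python
def best_lin_baud(f_cpu: int, baud: int):
    """Return (LBT, LDIV) minimising the baud rate error, with 8 <= LBT <= 63."""
    best = None
    for lbt in range(8, 64):
        # Nearest divider for this LBT, clamped to the 12-bit LDIV range
        ldiv = min(max(round(f_cpu / (baud * lbt)) - 1, 0), 4095)
        error = abs(f_cpu / (lbt * (ldiv + 1)) - baud)
        if best is None or error < best[0]:
            best = (error, lbt, ldiv)
    return best[1], best[2]
```

For 8 MHz and 115200 baud this picks LBT = 23 and LDIV = 2, giving about 115942 baud, roughly 0.6% off the target.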

Status flags etc, see my answer here

@SpenceKonde
Owner

Thanks - the trick as I understand for vector bootloaders is this: You must imagine that any page may fail to write. What you just described does not solve that problem, because the order in which operations are performed must ensure that no failure of a write, even if its preceding erase succeeds, can break the device. It doesn't matter what you try to always erase-write there, because it is possible for the erase to succeed and the write to fail (I think a brownout is the cause of this).

The exact problem is:
The first page being programmed is page 0. The page buffer is filled and the page erase/write command is given. Subsequent pages have remains of other sketches in them.
Page 0 is erased successfully.
Then something goes wrong, and no data is written. Page 0 is now all 0xFF, regardless of what you tried to write. There is only one workaround:

If you never erase the first page except after erasing all non-bootloader pages, in reverse order, such that page 0 is the only one not erased, then it is safe to erase it: a failure in that case would leave it storing 0xFFFF for every instruction word in the first page, and since you erased all other pages, it would have 0xFFFFs all the way to the start of the bootloader. Hardware interprets 0xFFFF as SBRS r31, 7, so depending on what that register happened to hold, it would process them either one or two at a time until it reached the bootloader. The beginnings of all the bootloaders I've seen look like they'd get execution to the right place whether it arrived right in position or arrived 1 word in.
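The "right in position, or 1 word in" claim can be checked with a toy Python model. It assumes, as described above, that each 0xFFFF word acts as a skip conditioned on bit 7 of r31; `landing_word` is a made-up name, and the model ignores that a skip over a two-word first instruction of the bootloader would land further in.

```python
def landing_word(boot_start: int, r31_bit7: bool) -> int:
    """Word address at which execution leaves a fully erased application area.

    Every word from 0 to boot_start - 1 is taken to be 0xFFFF.
    """
    pc = 0
    step = 2 if r31_bit7 else 1   # a taken skip hops over the next 0xFFFF too
    while pc < boot_start:
        pc += step
    return pc
```

Whatever r31 holds, the result is either `boot_start` or `boot_start + 1`.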

@stefanrueger

> What you just described does not solve that problem

Correct, but then you hadn't articulated that problem (brownout after page erase) in the queries I answered 😉. Now that you have, here is a safe workflow wrt the brownout problem: The uploader needs to ask the bootloader to erase the chip before programming it. AVRDUDE does that, but optiboot ignores it. Urboot can be compiled so that it emulates a chip erase; use only those urboot configurations with ce. Urboot chip erase starts from below the bootloader and erases top down. When a brownout or other reset happens during that time, the jump to the bootloader is still there. When urboot finally erases page 0, a brownout or reset will still find the bootloader, as you say, owing to the (undocumented) 0xffff opcode behaviour. Then the uploader writes the sketch from page 0 onwards, and whatever the uploader tells urboot, it will put an r/jmp to itself in there (provided it has that P vector protection flag).
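Why the erase order matters can be seen in a toy Python simulation that interrupts the erase after every page. The page states and function names are invented for the sketch, and it pessimistically treats any leftover sketch code as fatal once page 0 no longer holds the jump.

```python
ERASED, STALE, JUMP = "erased", "stale", "jump-to-bootloader"


def boots_ok(app_pages) -> bool:
    """A reset reaches the bootloader if page 0 still holds the jump, or if
    every application page is erased so execution slides up to the bootloader."""
    return app_pages[0] == JUMP or all(p == ERASED for p in app_pages)


def safe_at_every_step(erase_order, n_pages: int = 8) -> bool:
    """Erase pages in the given order; a reset may hit after any single erase."""
    pages = [JUMP] + [STALE] * (n_pages - 1)   # old sketch, protected vector
    for page in erase_order:
        pages[page] = ERASED
        if not boots_ok(pages):
            return False
    return True


top_down = list(range(7, -1, -1))   # just below the bootloader first, page 0 last
bottom_up = list(range(8))          # page 0 first
```

`safe_at_every_step(top_down)` holds, while `safe_at_every_step(bottom_up)` fails as soon as page 0 is erased with stale pages still above it.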

@SpenceKonde
Owner

SpenceKonde commented Mar 17, 2023

Oh good. (I'd argue that this is so fundamental to a bootloader that making that an optional feature is crazy - like a car for which the engine was an option, or the brakes - without that option, you don't have a car :-P) Optiboot as it stands on ATTinyCore is a toy bootloader, unfit for any practical purpose, because it bricks itself so readily.

"Does not occasionally brick itself at random" (we were never able to figure out how the problem came about in the field. They just dropped out and stopped calling home, and investigation showed that the first page had been erased).

@stefanrueger

Well, I consider urboot to be a kit car. You build what you need for your application.

Why not give urboot a whirl and see how it goes? For vector bootloaders you want those that have a capital P in the feature string (protect reset vector) and ce capability. Remember you still need a workflow where the uploader issues a CE before programming.

@stefanrueger

> just dropped out and stopped calling home, and investigation showed that the first page had been erased

Hmm, I don't see why that should happen during normal operation. Could it be that an application kinda calls the bootloader or jumps somewhere into its middle? So that may well be a bug in the application. Mind you, urboot cannot prevent an application from figuring out where the SPM opcode is located in the bootloader and aggressively jumping to that location. You need a certain level of benign environment, as bootloaders are on some theoretical level inherently unsafe by design (they water down the Harvard model of separation of code and data space by their very definition).

@stefanrueger

> just dropped out and stopped calling home, and investigation showed that the first page had been erased

Thinking a bit more about this: So unless the user was doing OTA (does optiboot even have code for this?), the bootloader was not deliberately called, right? So it must be that the application somehow jumped to somewhere in the bootloader (I am assuming here that the application neither explicitly uses spm nor stores data in flash that happen to be spm opcodes).

Wild jumps are easily done.

One common reason is not accounting for SRAM use. If the stack grows into the .data, .bss and heap area, then any function/ISR that writes to a variable (that is stored outside the registers) can potentially alter the return address on the stack. And, hey presto, the application jumps somewhere else unknown when attempting to return. SRAM in small parts is notoriously small. Arduino libraries use SRAM gratuitously: for example, driver classes have SRAM variables for the pin number etc. That is a solder-time constant, no need to put that in SRAM. Buffer sizes for UART I/O libraries are hard to control and default to large values. I have yet to see source code that reasons about SRAM usage, function call depth, size of local variables, ISRs prematurely allowing interrupts etc. I have yet to see applications that track free memory. To wit: that's why I think applications are often called sketches (b/c they are sketchy). And I have yet to see a test protocol that uses, say, 20% of SRAM in an initialised buffer to push .data, .bss and heap out in an attempt to see whether the application crashes. If it doesn't in all the obvious test cases and you then remove the 20%-SRAM test buffer in .data, you have a reasonable assurance that SRAM use might be OK when deployed in anger.
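The arithmetic behind the 20% padding test can be written down as a short Python sketch. All names and numbers are illustrative; on real hardware the worst-case stack depth is exactly the quantity nobody knows, which is why the test would be run on the device rather than calculated.

```python
import math


def sram_headroom(sram_size, static_bytes, heap_bytes, worst_stack_bytes):
    """Bytes left between the top of the heap and the deepest stack excursion."""
    return sram_size - static_bytes - heap_bytes - worst_stack_bytes


def survives_padding_test(sram_size, static_bytes, heap_bytes,
                          worst_stack_bytes, pad_fraction=0.20):
    """True if the application still fits with an extra test buffer in .data."""
    pad = math.ceil(sram_size * pad_fraction)
    return sram_headroom(sram_size, static_bytes + pad, heap_bytes,
                         worst_stack_bytes) >= 0
```

For a part with 512 bytes of SRAM, 300 bytes of .data/.bss and a 150-byte worst-case stack, the nominal headroom is 62 bytes but the padded test fails, signalling that the margin is thin.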

Also, the smaller the flash, the higher the probability that a wild jump is to a location that eventually visits the single spm opcode in the bootloader. Hopefully there is only a single spm opcode in the bootloader! Urboot uses one and only one spm opcode. I think optiboot has six spm opcodes scattered in its standard bootloader. Not only is this wasteful, it also considerably increases the chances of random jumps finding an spm.
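Anyone wanting to check how many spm opcodes a bootloader binary contains could start from a Python sketch like the one below. It scans a raw binary image for the 16-bit words 0x95E8 (spm) and 0x95F8 (spm Z+); `spm_word_addresses` is a made-up helper, and a hit may also be data or the second word of a two-word instruction, so treat the result as an upper bound.

```python
import struct

SPM_OPCODES = {0x95E8, 0x95F8}   # spm and spm Z+


def spm_word_addresses(image: bytes):
    """Word offsets in the image whose little-endian value is an spm opcode."""
    n_words = len(image) // 2
    words = struct.unpack(f"<{n_words}H", image[: 2 * n_words])
    return [addr for addr, word in enumerate(words) if word in SPM_OPCODES]
```

The offsets are relative to the start of the image passed in, not absolute flash addresses.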

@mcuee
Author

mcuee commented Oct 20, 2023

FYI, MegaCore, MightyCore, MiniCore, MajorCore, and MicroCore have now adopted Urboot.

@SpenceKonde
Owner

This core is planned to get Urboot and AVRDUDE 7.x; the timeline for this is the next release. However, the next release timeline is less fixed for this core than for the two cores for the currently active product lines, which are seeing increasing popularity and user engagement and which are more personally useful to me. I've been testing AVRDUDE on DxCore (which would be the first to use it if it went in now; it's going to see a release before mTC) over the past couple of days, though, and was surprised and dismayed to find that 7.2 seems to have suffered a critical regression and can no longer upload to optiboot boards. This is a non-starter for impacted cores. I am in the process of writing this up as a number of new issues in your repo.

@MCUdude
Contributor

MCUdude commented Oct 20, 2023

It would be great if you could report any Avrdude-related bugs/regressions if you find any. Our goal is to improve the tool, not break existing features while implementing new things. If Avrdude 7.2 has significant regressions, I can always just roll back to 7.1, which I have been using with Optiboot for the last six months and which also supports Urboot.

The Avrdude code base has gotten a huge overhaul since 6.3, and there is always a need for testers. That's why I'd like to push the latest Avrdude version available.

@stefanrueger

> [7.2 can] no longer upload to optiboot boards

There have not been many changes for -c arduino. Difficult to give advice w/out an issue reporting details; you might strike lucky with -xnometadata -c urclock which is able to upload to optiboot. You might need to tell AVRDUDE if the bootloader can deal with EEPROM (-xeepromrw) and the size of the bootloader -xbootsize=....

@MCUdude
Contributor

MCUdude commented Oct 23, 2023

> I am in the process of writing this up as a number of new issues in your repo.

> You might need to tell AVRDUDE if the bootloader can deal with EEPROM (-xeepromrw) and the size of the bootloader -xbootsize=....

Still waiting to hear what you think is broken. However, -c urclock needs to know how large the bootloader is in order to prevent the bootloader from overwriting itself. Avrdude stores a bunch of pre-calculated hashes to quickly determine which bootloader the target is running, so the user doesn't have to "manually" specify the boot size. If this is what you are running into, we can probably add the hashes for the ATTinyCore Optiboot files, given that these files won't change and Urboot will be the preferred bootloader in the future.

avrdude ur_initstruct() error: bootloader might be optiboot 5.0? Please use -xbootsize=<num>
