Simply allocating memory causes hard crash & board freeze on code re-upload #9162

Closed
raquo opened this issue Apr 10, 2024 · 7 comments

@raquo

raquo commented Apr 10, 2024

CircuitPython version

Adafruit CircuitPython 9.0.2 on Adafruit ItsyBitsy M4 (tested on two boards)

Code/REPL

import gc

mem_blocks = []

for i in range(0, 1):
    mem_blocks.append(bytearray(10_000))
    print("FREE MEM after BIG block {}: {}".format(i, gc.mem_free()))

for i in range(0, 70):
    mem_blocks.append(bytearray(1_000))
    print("FREE MEM after block {}: {}".format(i, gc.mem_free()))

print("INIT DONE")

ix = 0

while True:
    if ix % 10_000 == 0:
        print(gc.mem_free())
    ix += 1

Behavior

When this code is first uploaded to the board, it works as expected, and prints the expected free memory in bytes:

code.py output:
FREE MEM after BIG block 0: 139824
FREE MEM after block 0: 138672
FREE MEM after block 1: 137536
...
FREE MEM after block 68: 69024
FREE MEM after block 69: 67888
INIT DONE
67776
67776
67776
...

As you can see, CircuitPython reports about 67K bytes of free memory.

Now, as the board is running, if you make any code change (e.g. change "INIT DONE" to "INIT DONE!") and upload it to the board, the board freezes and becomes unresponsive, with symptoms similar to what I described in #9138. Specifically:

  • The board's serial output in Mu freezes ("67776" is the last output, no "code reloading" message, no error messages)
  • The CIRCUITPY drive becomes unresponsive, the board's USB breaks, it glitches other USB devices, etc.
  • The board's LEDs go dark
  • The CIRCUITPY drive is unmounted after ~20 seconds

As with #9138, the code upload fails (old code remains on the board), and now I have a board that I can't upload any code to (the bug is triggered by the code currently on the board, not by the code being uploaded). So every time I run into this, I have to erase the board's filesystem and re-upload the code.

Description

No response

Additional information

Reproduction variations:

  • Reproduction as shown is 100% reliable for me, tested on two boards.
  • Instead of allocating 1 x 10K block and 70 x 1K blocks, you can also just allocate 120 x 1K blocks to trigger the bug. This requires 40K more memory than the original recipe (120K vs. 80K), so it seems that allocating a single big 10K memory block has an outsized impact on triggering the problem.
  • Also, if you allocate the 1 x 10K block after the 1K blocks instead of before them, you need to increase the number of 1K blocks to 100 before the bug is triggered, so it seems that allocating the big 10K block early in the program makes the problem easier to trigger.
  • Reducing the total amount of allocated memory prevents the bug from being triggered.
  • Reloading the board with Ctrl+D in Mu does not trigger the bug, only uploading the code does.
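
The variations above can be folded into one parameterised script for anyone who wants to try them. This is only a sketch and not part of the original report: the helper names `free_mem` and `allocate` and their parameters are hypothetical, and the `getattr` fallback is there only so the script also runs on desktop Python, where `gc.mem_free()` does not exist.

```python
import gc


def free_mem():
    # gc.mem_free() is available on CircuitPython but not on desktop Python,
    # so return None when it is missing.
    fn = getattr(gc, "mem_free", None)
    return fn() if fn is not None else None


def allocate(n_small, n_big=1, big_first=True, small_size=1_000, big_size=10_000):
    """Allocate the blocks and return them; the caller must keep the list alive."""
    blocks = []
    if big_first:
        for _ in range(n_big):
            blocks.append(bytearray(big_size))
    for _ in range(n_small):
        blocks.append(bytearray(small_size))
    if not big_first:
        for _ in range(n_big):
            blocks.append(bytearray(big_size))
    return blocks


# Pick one variation from the list above:
mem_blocks = allocate(n_small=70)                        # original recipe
# mem_blocks = allocate(n_small=120, n_big=0)            # 1K blocks only
# mem_blocks = allocate(n_small=100, big_first=False)    # big block last

print("INIT DONE, free:", free_mem())
```

To reproduce the hang, the `while True:` loop from the original code would still need to follow this, so that the program is running when the new code is uploaded. The per-block `print` calls are omitted here, so the exact block counts needed to trigger the bug may differ slightly from the ones reported above.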

I have a sizeable CircuitPython project that is reliably triggering this bug on every code upload. Specifically, it's my use of audiomp3.MP3Decoder that triggers the problem. I am instantiating MP3Decoder relatively early in the program as recommended here, and it allocates around 35K of memory, of which at least one 9K block is contiguous. I don't even get to playing any audio, it's just this one instantiation that triggers the bug.

I wanted to make a small reproduction without any dependencies, and it turned out that simply allocating memory is enough, so here we are.

Of course, I understand that memory is finite, but:
a) The reproduction program is very simple and only uses 80K of memory, and gc.mem_free() still reports more than 60K of available memory, so I should not be running out.
b) Even if memory fragmentation is an issue, I would expect a MemoryError at runtime, not a hard crash during code upload that puts the board in a hard-to-recover state.
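
On the fragmentation point: as far as I understand, gc.mem_free() reports the total free heap, which does not say how large the biggest contiguous free block is. One rough way to probe that is to binary-search for the largest bytearray that can be allocated without raising MemoryError. The sketch below is illustrative only: `largest_allocatable` is a hypothetical helper, the 64K upper bound is arbitrary, and the search assumes that if one size fails to allocate, every larger size fails too.

```python
import gc


def largest_allocatable(upper):
    """Return the largest size in [0, upper] for which bytearray(size) succeeds."""
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        try:
            probe = bytearray(mid)
        except MemoryError:
            hi = mid - 1  # mid is too big
        else:
            del probe  # drop the reference so the probe can be reclaimed
            lo = mid  # mid fits, try larger
    return lo


gc.collect()
limit = 64 * 1024  # arbitrary upper bound for the search
print("largest single allocation up to limit:", largest_allocatable(limit))
```

This is the behavior expected in (b): a failed allocation surfaces as a catchable MemoryError while the program is running, and the board stays responsive.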

raquo added the bug label Apr 10, 2024
@dhalbert
Collaborator

What kind of host computer are you using, and what version of its OS is it running? There are issues with macOS Sonoma.

dhalbert added this to the 9.0.x milestone Apr 10, 2024
@tannewt
Member

tannewt commented Apr 10, 2024

  • The CIRCUITPY drive becomes unresponsive, the board's USB breaks, it glitches other USB devices, etc.

This can happen if CP hangs and stops running the USB task. Definitely a severe bug.

@raquo
Author

raquo commented Apr 10, 2024

@dhalbert Originally I tested this on macOS Monterey 12.7.4 on a 2018 Intel Mac Mini.

To be sure, I've just reproduced this on Windows 10 Home (running natively on a 2012 Intel MacBook Pro, not virtualized), and the bug triggers the same way there, with all the same symptoms, including the code failing to upload and the CIRCUITPY drive going unresponsive. The only apparent difference is that other USB devices (e.g. the mouse) don't glitch when this happens; I guess this computer might have better isolation between USB ports, or something like that.

@dhalbert
Collaborator

dhalbert commented Apr 11, 2024

@tannewt Just a guess, but I am wondering if this is due to allocate_ram_cache() in supervisor/external_flash.c being unable to allocate cache space. There are fallback actions, but maybe something is not working right. I should spin up a debug build and see if I can catch the hang.

EDIT: yes, working on fixes in there.

@dhalbert
Collaborator

@raquo Thank you very much for the "always fails" test program. That was very helpful in debugging and in testing my fixes.

@raquo
Author

raquo commented Apr 11, 2024

@dhalbert My pleasure, thank you for the fixes! :)

@dhalbert
Collaborator

Fixed by #9169.
