Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[webassembly][pyscript] Unable to process over 88Kb of code #14307

Open
2 tasks done
WebReflection opened this issue Apr 15, 2024 · 5 comments
Open
2 tasks done

[webassembly][pyscript] Unable to process over 88Kb of code #14307

WebReflection opened this issue Apr 15, 2024 · 5 comments
Labels

Comments

@WebReflection
Copy link

Checks

  • I agree to follow the MicroPython Code of Conduct to ensure a safe and respectful space for everyone.

  • I've searched for existing issues matching this bug, and didn't find any.

Port, board and/or hardware

webassemply

MicroPython version

latest for pyscript

Reproduction

We bootstrap our virtual env via FS operations and we port that whole operation via strings (as in Python strings) but we reached a memory limit so that everything breaks. Under 60Kb of single run we don't have issues but beyond that it's breaking.

Expected behaviour

MicroPython should be able to run up to 256Kb of code or more to me or we should be able to raise that limit. We tried to explicitly increase the heapsize without any success.

Observed behaviour

runPython(moreThan88KCode) fails, it doesn't on Pyodide, it doesn't if we manually remove some line to evaluate from that code.

Additional Information

This file contains our bootstrap logic which is automatically generated and it travels across workers or projects to be sure our namespace is up and running by the time users' code executes.

It served us well as logic to date and we never had issues with MicroPython but a recent branch that would like to bring more is failing hard on it.

micropython-limit.txt

@WebReflection
Copy link
Author

WebReflection commented Apr 15, 2024

This file, once renamed as .html will show the limit. This is without PyScript module around.

Remove the line line 47 and see no error ... please note: you need to remove that line, as even commenting it, will still show the memory out of bound error, just to be clear it's not the code the issue, rather how much input one can run.

micropython-limit.html.txt

@WebReflection
Copy link
Author

WebReflection commented Apr 16, 2024

@dpgeorge I don't know if this is useful but in this line maybe there's not enough memory allocated ... there's a code reference and maybe we can scale up that allocation accordingly to its length?

const value = Module._malloc(3 * 4);
but I also wonder if this should be JSFLAGS += -s ALLOW_MEMORY_GROWTH=1 instead although it seems to work fine to date.

edit in this page https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html#access-memory-from-javascript they suggest this instead which reasons well to me:

var buf = Module._malloc(myTypedArray.length*myTypedArray.BYTES_PER_ELEMENT);

we don't have a typed array but JS is UTF-16 so that _maloc could instead be:

var buf = Module._malloc(code.length * Uint16Array.BYTES_PER_ELEMENT);

This seems like less hard coded although the difference between that value and 3 * 4 (12) is huge so maybe there's a better threshold to consider, still based on input length, not on "magic numbers", still assuming that's the culprit.

@dpgeorge
Copy link
Member

Thanks for the report.

The problem is an overflow of the Emscripten C stack. When using Module.ccall(...) and passing an argument (from JS to C) as a type "string", Emscripten will use stringToUTF8OnStack() to allocate space for the string on the C side, on the C heap. This is nothing to do with MicroPython, it's internal Emscripten behaviour.

When passing really big strings from JS to C, the allocation for this string on the C heap fails. That's the error that you see.

The fix is for MicroPython to explicitly allocate memory for the string that's to be passed from JS to C, and to use stringToUTF8() instead of stringToUTF8OnStack().

@WebReflection
Copy link
Author

WebReflection commented Apr 16, 2024

This is nothing to do with MicroPython, it's internal Emscripten behaviour.

I trust your opinion and experience but we don't have the issue on Pyodide so maybe they do that behind the scene already? If you find a single minute, and a single minute is enough to explain, please expand on that 3 * 4 Module._malloc as it buffles me why that's needed, although maybe it's just to allocate enough memory for those pointers used then via ccall(...)?

Thanks!

Despite that clarification, anything we can do on our side to prevent that error from happening? 🤔

edit any reason me playing around heapsize never produced a different result? not super important, just trying to understand when we want to eventually increase the default heapsize.

@dpgeorge
Copy link
Member

If you find a single minute, and a single minute is enough to explain, please expand on that 3 * 4 Module._malloc as it buffles me why that's needed, although maybe it's just to allocate enough memory for those pointers used then via ccall(...)?

That allocation is for the proxy to pass back the Python result from C to JS. It's a small 3-word array that contains the type of the proxy as the first entry (eg int, str, list, dict, function, etc) and then additional information in the other two entries.

I should make these magic numbers constants...

Despite that clarification, anything we can do on our side to prevent that error from happening?

Probably not. The only suggestion is not to pass large strings to runPython!

any reason me playing around heapsize never produced a different result?

Because it's a stack overflow, not running out of heap. Apparently the Emscritpen stack size can only be increased when building via the -sSTACK_SIZE=xxx option.

dpgeorge added a commit to dpgeorge/micropython that referenced this issue Apr 17, 2024
Otherwise Emscripten allocates it on the Emscripten C stack, which will
overflow for large amounts of code.

Fixes issue micropython#14307.

Signed-off-by: Damien George <damien@micropython.org>
dpgeorge added a commit to dpgeorge/micropython that referenced this issue Apr 24, 2024
Otherwise Emscripten allocates it on the Emscripten C stack, which will
overflow for large amounts of code.

Fixes issue micropython#14307.

Signed-off-by: Damien George <damien@micropython.org>
dpgeorge added a commit to dpgeorge/micropython that referenced this issue Apr 24, 2024
Otherwise Emscripten allocates it on the Emscripten C stack, which will
overflow for large amounts of code.

Fixes issue micropython#14307.

Signed-off-by: Damien George <damien@micropython.org>
dpgeorge added a commit to dpgeorge/micropython that referenced this issue Apr 24, 2024
Otherwise Emscripten allocates it on the Emscripten C stack, which will
overflow for large amounts of code.

Fixes issue micropython#14307.

Signed-off-by: Damien George <damien@micropython.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants