New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gc: Support valgrind tracking of Python heap allocations #14196
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #14196 +/- ##
=======================================
Coverage 98.39% 98.39%
=======================================
Files 161 161
Lines 21200 21214 +14
=======================================
+ Hits 20860 20874 +14
Misses 340 340 ☔ View full report in Codecov by Sentry. |
Code size report:
|
Looks interesting. I once used MTuner to look at the microPython heap; it's been years but IIRC it was quite interesting and useful as well, also because with the tagging you could highlight which memory is what. Hooking it is done calling functions similar to valgrind's. Hence a request: how about making memory profiling in gc.c generic, simply by name, i.e. keep everything as it is now but instead of
Or maybe that should even be a separate include file so that if one wants to use memory tagging for debugging, which essentially means adding MP_MEMPROFILER_TAG(some_piece_of_mem, "tag1") in a bunch of locations, they can do that by including that same file. |
Thanks very interesting, thanks @stinos! I hadn't seen MTuner and I agree it could be a very useful tool for some MicroPython analysis.
I would quibble with calling valgrind's memcheck a profiler, it's more of a runtime correctness checker (i.e. closer to tools like ASan or UBSan than tools like MTuner). This is kind of a philosophical point, but kind of not because there's a bunch of places in this PR where we have to tell valgrind "we're about to access some memory that's supposed to be inaccessible now, but it's not an error" (for example, see around gc.c:868). This concept doesn't really have an equivalent when doing memory profiling. That said, I do think it's a good idea to make the macros generic if possible. I will take a look and see how well they can be made to fit. I really like the idea of making it easier to integrate multiple different tools with the MicroPython core, while keeping the main source files pretty clean. |
Actually maybe this is less complex than I thought, can probably collapse down to something like:
Will try this soon, thanks! |
Sure; I didn't give the naming a lot of thought but indeed |
Allows doing "export STRIP=" in a profile or .envrc file, to not strip binaries by default. Signed-off-by: Angus Gratton <angus@redyak.com.au>
This seems to work quite well, can run individual test cases under valgrind and (although slow) it shows any memory safety issues. Replaces the valgrind false-positives fix previously applied in 8407159. That one effectively disabled gc roots from being marked, so things would be garbage collected even when there were valid references to them. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au> Signed-off-by: Angus Gratton <angus@redyak.com.au>
This option prevents the Python heap from reusing freed memory addresses for new allocations (obviously only viable for some workloads!) Advantage is that valgrind will be able to catch any use-after-free, that might have been missed if the freed memory address was reallocated for another use. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
02e3968
to
e273f6d
Compare
Extends Valgrind memcheck support in the unix port to track the Python heap, allowing Valgrind to report use-after-free and (in some cases) out of bounds reads and writes when accessing Python heap memory.
Valgrind support is implemented in a dedicated header, to try and keep gc.c relatively clean.
Also adds an additional valgrind build configuration:
MICROPY_DEBUG_VALGRIND_MAX_ADDRSPACE
. When set, the heap will never reuse the address ranges of freed memory - new allocations are always at a higher address than any previous allocation. Assuming there's enough total heap, this means that use after free errors can never be masked by the freed address being already reallocated for something else.This branch was used when reproducing and verifying #14029, and seems like it will be useful for other purposes as well.
TODO
gc_get_ptr()
that was added in gc.c: Ensure a gap of one byte before the finaliser table. #7722 (8407159). I need to verify but this logic appears to prevent the GC marking any "pointer" regions if valgrind is enabled. The side effect seems to be that objects retained on the stack (like the running bytecode!), and root pointers, are being garbage collected prematurely. I think we can implement this differently to still avoid warnings if the GC reads uninitialised stack words, but without incorrect GC behaviour. EDIT: Done, have implemented this in a different way so normal gc scans can continue.Limitations