Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix several sources of undefined behavior in type casters #549

Merged
merged 2 commits into from
May 22, 2024

Conversation

oremanj
Copy link
Contributor

@oremanj oremanj commented Apr 27, 2024

  • Some cast operators can raise exceptions, but many container casters would call the cast operator in their noexcept from_python function. Fix by adding a new can_cast<T>() caster method that returns false if operator cast_op<T>() would throw an exception. Note that implementing this properly for unique_ptr required some changes to the nb_type_relinquish_ownership API, mostly in the creation of a new nb_type_restore_ownership function to reverse its effects. Fixes [BUG]: 'wrapper' type casters can crash by raising exceptions through noexcept #497.

  • In nb::cast and nb::try_cast, we were previously deallocating the cleanup list before initializing the result object, which is generally UB when implicit conversions are involved.

  • Avoid creating dangling pointers/references in container type casters, using a new flags_for_local_caster() function that casters call when they're delegating to an inner caster. There are two ways this would happen in the past:

    • Pointers to bound types created using implicit conversions would dangle once the cleanup_list was destroyed, which is problematic for nb::cast(). This is resolved by removing convert from the inner caster's flags if the manual flag is set.
    • Pointers to custom-caster types usually refer into the type caster object, so they would dangle if the caster were stored on the stack, which is a common idiom for containers. This is resolved by static_asserting in flags_for_local_caster that we're not casting to T* unless T is a bound type.

@oremanj
Copy link
Contributor Author

oremanj commented Apr 27, 2024

I'm not sure yet what's going on with the failing tests - all of them are because the destruction of one of the Example objects is being delayed past the end of the test. It happens on different builds each time I run, but consistently 2-3 of them. Haven't been able to reproduce it locally yet. Adding an extra collect() didn't help.

@wjakob
Copy link
Owner

wjakob commented May 7, 2024

Hi @oremanj, just to keep you posted: I'm currently swamped with a deadline (in ~2 weeks), and then I'm moving to another country. I won't have time to debug the test issue in a near timeframe.

@wjakob wjakob force-pushed the master branch 2 times, most recently from 2e640d4 to 3a4119f Compare May 22, 2024 08:21
wjakob added a commit that referenced this pull request May 22, 2024
wjakob added a commit that referenced this pull request May 22, 2024
wjakob added a commit that referenced this pull request May 22, 2024
This commit splits up a large testcase into smaller ones, which fixes
a weird GC issue that seems to be specific to Python on Windows
(instances being collected too late, which makes a test flaky).
@wjakob
Copy link
Owner

wjakob commented May 22, 2024

I looked into the GC issue and was eventually able to reproduce it on a windows machine. As far as flakes go, it seems relatively benign -- some instances not being collected by the time they're supposed to, but then they do get deleted later on.

It's weird, as if something was wrong with the GC. Or somebody else is holding further references. Perhaps it's a combination of the GC and the extra rewriting that Pytest does to capture exceptions and warnings with extra contexts to provide readable output. But then it's weird that this only seems to happen on Windows.

The issue goes away simply by refactoring the big test into smaller ones. So I will leave it at this for now.

@wjakob
Copy link
Owner

wjakob commented May 22, 2024

This PR was merged out of band in commit c36584e. (I officially merged the PR but will force-push the tweaked version. Just in case it plays a role for the bean counting somewhere ;))

Thanks for your work on this, it's a great improvement.

@wjakob wjakob merged commit 1861447 into wjakob:master May 22, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG]: 'wrapper' type casters can crash by raising exceptions through noexcept
2 participants