Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python 3.13: list.remove(x) performance degradation with missing x and expensive repr(x) #116792

Closed
hroncok opened this issue Mar 14, 2024 · 10 comments
Labels
3.13 bugs and security fixes performance Performance or resource usage type-bug An unexpected behavior, bug, or error

Comments

@hroncok
Copy link
Contributor

hroncok commented Mar 14, 2024

Bug report

Bug description:

Hello. When debugging a problem in sip which hanged and caused OOM kills when building PyQt5 and PyQt6 on Python 3.13, I found the following:

217f47d is the first new commit that introduces the hangs/kills.

The commit changes the exception message in list.remove(x) to include repr(x) in it.

sip has the following code: https://github.com/Python-SIP/sip/blob/6.8.3/sipbuild/generator/parser/parser_manager.py#L435-L438

                    try:
                        self._template_arg_classes.remove(klass)
                    except ValueError:
                        pass
  • klass is a complex dataclass member
  • the block is called many times in a loop
  • the ValueError exception message is never retrieved

When changing the code to:

                    if klass in self._template_arg_classes:
                        self._template_arg_classes.remove(klass)

The hang/kill does not happen.


Further investigating the problem, it became obvious that it is the repr() call that now implicitly happens when constructing the ValueError in list.remove() causes the hang. In fact, calling repr(klass) directly in the else branch of the if even on Python 3.12.2 causes the hangs/kills in sip-build as well.

Unfortunately, I don't have a smaller reproducer than building PyQt from sources on Python 3.13.

I don't know why repr of a dataclass eats all my RAM. Perhaps the dataclass is somehow recursive like in #116647 -- that might be a problem in dataclass repr implementation.

However, I opened this issue to report this:

Constructing the ValueError's exception message in list.remove(x) is now expensive when repr(x) is expensive. Could the message be perhaps lazily evaluated to avoid calling repr(x) when the message will never be shown, like in the trivial try-except case?

CPython versions tested on:

3.13

Operating systems tested on:

Linux

@hroncok hroncok added the type-bug An unexpected behavior, bug, or error label Mar 14, 2024
@hugovk hugovk added performance Performance or resource usage 3.13 bugs and security fixes labels Mar 14, 2024
@gaogaotiantian
Copy link
Member

So here's my two cents.

try ... except should be used when the exception is an exception. If the exception is a common expected case, then an explicit if check should be used. I think this applies to many cases, like whether to check if a key is in a dict, or just access it and catch an exception.

So if your code do this is a loop and the exception is hit frequently, then maybe an explicit check is due.

Making the message evaluate lazily has its own issues, for example, reference cycles. Imagine the list containing a frame object that has a reference to the exception itself. Also it definitely would make the code more complicated.

However, I agree this is not an ideal behavior. Constructing the repr of the object works for most of the internal data structures, but I don't think it looks pretty when the repr of the object is huge and multiline - that might be worse than the original message.

From my perspective, I think the discussion should be whether to always show the repr of the objectin the message - is it worth it to have enormous string representation in the exception message in some cases.

@hroncok
Copy link
Contributor Author

hroncok commented Mar 14, 2024

Whether asking for forgiveness or asking for permission is the better approach is probably out of scope here. Using if rather than try works for the sip-build case, but might not work for other code.

@encukou
Copy link
Member

encukou commented Mar 18, 2024

cc @corona10 @vstinner, author & reviewer of the change

@vstinner
Copy link
Member

Currently, Python has no way to evaluate lazily an error message. I propose to just revert the change: #116956

While it's nice to have a better error message, I don't think it's worth it to make raising an exception so slower.

By the way, IMO the root cause is more that repr() on a PyQt object can cause a OOM, it should not happen.

@vstinner
Copy link
Member

@hroncok: Impressive bisection work, congrats! :-)

@corona10
Copy link
Member

corona10 commented Mar 18, 2024

IIUC, it means that %R is unsafe at the other place for the same use case.

@hroncok
Copy link
Contributor Author

hroncok commented Mar 18, 2024

By the way, IMO the root cause is more that repr() on a PyQt object can cause a OOM, it should not happen.

I agree.

@swt2c
Copy link

swt2c commented Mar 25, 2024

I investigated the root cause a bit further in the case of the PyQt build OOM issue. It seems the issue is that the repr() of a dataclass is recursive (and this seems to be the documented behavior: https://docs.python.org/3/library/dataclasses.html), and for a very complicated dataclass containing many other dataclasses, this can result in a very large string. Turning off the repr generation in the dataclass also seems to work around the problem in the PyQt/sipbuild. Ie:

@dataclass(repr=False)
class WrappedClass:
    """ Encapsulate a wrapped C/C++ namespace/class/struct/union. """
    ...

@vstinner
Copy link
Member

Turning off the repr generation in the dataclass also seems to work around the problem in the PyQt/sipbuild. Ie:

You may report the issue to PyQt :-)

@vstinner
Copy link
Member

Issue fixed by the change commit f6cdc6b. I close the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes performance Performance or resource usage type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

7 participants