ptera crashes on Gracehooper #216

Open
Delaunay opened this issue Apr 19, 2024 · 5 comments

@Delaunay
Collaborator

dlrm.0 [overseer_error] {'message': 'weakly-referenced object no longer exists', 'type': 'ReferenceError'}
dlrm.0 [stderr] ================================================================================
dlrm.0 [stderr] voir: An error occurred in an overseer. Execution proceeds as normal.
dlrm.0 [stderr] ================================================================================
dlrm.0 [stderr] Traceback (most recent call last):
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/voir/phase.py", line 259, in _step_one
dlrm.0 [stderr]     next_phase = gen.send(ph.value)
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/voir/tools.py", line 142, in wrapped
dlrm.0 [stderr]     yield from fn(ov, getattr(ov.options, argname))
dlrm.0 [stderr]   File "/home/delaunay/milabench/benchmarks/dlrm/voirfile.py", line 50, in instrument_main
dlrm.0 [stderr]     (ov.probe("//run > L").throttle(1)["L"].map(float).give("loss"))
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/voir/overseer.py", line 178, in probe
dlrm.0 [stderr]     return self.require(ProbeInstrument(select(selector, skip_frames=1)))
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/ptera/selector.py", line 845, in select
dlrm.0 [stderr]     rval = _resolve(sel, env, count())
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/ptera/selector.py", line 771, in _resolve
dlrm.0 [stderr]     el = _resolve(selector.element, env, cnt)
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/ptera/selector.py", line 797, in _resolve
dlrm.0 [stderr]     name = _eval(selector.name, env)
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/ptera/selector.py", line 700, in _eval
dlrm.0 [stderr]     return x.eval(env)
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/ptera/selector.py", line 641, in eval
dlrm.0 [stderr]     return dict_resolver(env)(x)
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/ptera/selector.py", line 574, in resolve
dlrm.0 [stderr]     import codefind
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/codefind/__init__.py", line 5, in <module>
dlrm.0 [stderr]     code_registry.setup()
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/codefind/registry.py", line 38, in setup
dlrm.0 [stderr]     self.collect_all()
dlrm.0 [stderr]   File "/home/delaunay/env/lib/python3.10/site-packages/codefind/registry.py", line 45, in collect_all
dlrm.0 [stderr]     if isinstance(obj, types.FunctionType):
dlrm.0 [stderr] ReferenceError: weakly-referenced object no longer exists
dlrm.0 [stderr] ================================================================================
@Delaunay
Collaborator Author

@breuleux Do you know what could be causing this?

@Delaunay
Collaborator Author

I added this try/except in collect_all:

    def collect_all(self):
        # Collect code objects
        results = []
        for obj in gc.get_objects():
            try:
                if isinstance(obj, types.FunctionType):
                    results.append((obj, obj.__code__))
                elif getattr_static(obj, "__conform__", None) is not None:
                    for x in gc.get_referents(obj):
                        if isinstance(x, types.CodeType):
                            results.append((obj, x))
            except ReferenceError:
                print("Reference error, object got deleted")

It seems to help things move along.

Some of the objects returned by get_objects() got GC'ed just before or while the iteration was running.
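
For anyone who wants to reproduce the symptom in isolation, here is a minimal sketch. It assumes the offending object behaves like a `weakref.proxy` whose referent has already been collected; the thread does not identify the actual object, so that part is a guess. `Target`, `make_dead_proxy` and `collect_functions` are hypothetical names for illustration, not codefind code.

```python
import gc
import types
import weakref


class Target:
    """Hypothetical class, used only so we can build a weak proxy."""


def make_dead_proxy():
    # Create a proxy, then drop the only strong reference to its referent.
    target = Target()
    proxy = weakref.proxy(target)
    del target
    gc.collect()  # make sure the referent is really gone
    return proxy


def collect_functions(objects):
    # Same guard as in the patch above: skip objects that raise
    # ReferenceError when they are inspected.
    results = []
    skipped = 0
    for obj in objects:
        try:
            if isinstance(obj, types.FunctionType):
                results.append((obj, obj.__code__))
        except ReferenceError:
            skipped += 1
    return results, skipped


dead = make_dead_proxy()
# One real function and one dead proxy: the function is collected,
# the proxy is skipped instead of aborting the whole scan.
results, skipped = collect_functions([make_dead_proxy, dead])
```

Calling `isinstance(dead, types.FunctionType)` outside the try block raises the same `ReferenceError: weakly-referenced object no longer exists` as in the traceback, because the check ends up reading an attribute through the dead proxy.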

@breuleux
Member

@Delaunay I hadn't seen that error before, but the diagnosis/fix makes sense. Basically this? https://github.com/breuleux/codefind/pull/2/files

@Delaunay
Collaborator Author

Yes, that is it!

@breuleux
Member

The fix is included in codefind v1.0.4.
