Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate bug report from D. K. Smith #773

Open
adamkewley opened this issue Sep 8, 2023 · 8 comments
Open

Investigate bug report from D. K. Smith #773

adamkewley opened this issue Sep 8, 2023 · 8 comments

Comments

@adamkewley
Copy link
Collaborator

adamkewley commented Sep 8, 2023

Received in private email (search: "opensimcreator project Smith")

User reports that OSC is crashing whenever he loads his model file in it. And that the issue did not exist in v3.2 (assuming he means 0.3.2).

@adamkewley
Copy link
Collaborator Author

adamkewley commented Sep 8, 2023

Private (research) model requested and received. I'll hold this in private storage as issue_773_source_model.osim.

@adamkewley
Copy link
Collaborator Author

The model was sucessfully loaded with no crash or error in the following configurations:

  • MacBook Air, Ventura 13.5, OpenSim Creator development version
  • MacBook Air, Ventura 13.5, OpenSim Creator 0.5.1
  • MacBook Air, Ventura 13.5, OpenSim Creator 0.5.0
  • MacBook Air, Ventura 13.5, OpenSim Creator 0.3.2
  • Windows 11, OpenSim Creator development version
  • Windows 11, OpenSim Creator 0.4.1
  • Windows 11, OpenSim Creator 0.3.2
  • Ubuntu 22, OpenSim Creator development version
  • Ubuntu 22, OpenSim Creator 0.5.0

So I cannot reproduce the error with the given model file - I'll ask him for more information.

@adamkewley
Copy link
Collaborator Author

adamkewley commented Sep 12, 2023

One part of his bug was fixed in 16f46e7 but it wasn't tagged with an issue number because I noticed it in passing.

@adamkewley
Copy link
Collaborator Author

I have received a file that crashes from DKS. Details are:

Tested against:

Version Result
main Broke
0.5.2 Broke
0.4.1 Broke
0.4.0 Broke
0.3.2 Broke
0.2.0 Broke
0.1.6 Working
0.1.0 Working

So there might be hint about how to fix it in the diff between 0.1.6 and 0,.2.0

@adamkewley
Copy link
Collaborator Author

adamkewley commented Sep 17, 2023

The reason why 0.1.6 works is because it doesn't go through the step of re-finalizing connections:

It subsequently crashes if a connection-finalizing operation is performed later in the editing cycle (e.g. change the name, no crash; then change a joint, crash - but the crash is related to finalizing connections).

@adamkewley
Copy link
Collaborator Author

I created a simplified version of the model graph to visually inspect where exactly the cycle is in this particular model:

image

It looks like pectoral_girdle_b is attached to ground via two "directions". I.e. sometimes it's the parent, sometimes it's the child, in a mobilizer relationship. This is accepted by SimTK, but creates a graph cycle, which leads to OpenSim generating slave bodies, which leads to this particular bug.

@adamkewley
Copy link
Collaborator Author

3b24088 adds an even simpler bug reproduction, which is a file containing only two bodies and three joints.

The simplest possible failing version would be a single body and two joints, but that is co-incidently caught by opensim-org/opensim-core#3299


Now that I have delved into the segfault debug trace, reproduced the bug in a way that's independent of mesh data, muscles, etc. I think it is safe to conclude that this particular bug is related to how OpenSim's sockets retain pointers for longer than needed.

What's happening here is:

  • Connections are finalized
  • Which, because the model contains a cycle, automatically creates a slave body
  • And the necessary socket connections etc. to that slave are created
  • Then the connections are re-finalized in some cases by OSC, e.g. because re-naming a component in the model benefits from following socket pointers (rather than name strings) in order to get the new name
  • But re-finalizing connections does the whole graph traversal dance, which deletes the old slave and creates a new slave in a different memory location.
  • After doing that, it then goes through all the sockets, re-finalizing them
  • But re-finalizing sockets checks "does the socket currently have a pointer (it does), follow that pointer for socket-related maintenance and then write out the latest name (the behavior we want)
  • But the pointers held in the sockets of anything that was connected to the first version of slave are stale--the original slave has been deleted--segfault.

So the "hollistic" solutions to this (in OpenSim) are:

  • All sockets in components that point to slave elements are proactively cleared of pointers before graph traversal - this requires a change in OpenSim because only OpenSim internally knows which components are "adopted", etc.
  • OR: If graph traversal wants to make a slave, check if it has already been created first, rather than proactively wiping out all slaves and re-creating them (unsure how easy it is to track)
  • OR: Stop using pointers in sockets (might be slow)
  • OR: Add runtime lifetime checking to sockets, so that stale pointers are caught/reprocessed at runtime, rather than being assumed to always be correct (Add runtime checking for OpenSim::ComponentSocket<T> opensim-org/opensim-core#3533)

@adamkewley
Copy link
Collaborator Author

Non-"hollistic" solutions include:

  • Monkey-patching OSC to include tricks like having its own socket-correcting pass when renaming things in the model

But that comes with its own baggage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant