Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] Python garbage collection order differes and segfaults depending on whether there is a Process or not #12207

Open
sunethwarna opened this issue Mar 19, 2024 · 5 comments
Labels

Comments

@sunethwarna
Copy link
Member

sunethwarna commented Mar 19, 2024

Description
We found a strange behaviour in gc in python. If a kratos simulation is created without any processes, then the order of destruction in python is:

  1. The Kratos::Model
  2. The Kratos::SolutionStrategy
    This order is wrong, but it does not harm anyone since most of the destructors does not do anything with the Kratos::Model or 99.9% time, we have at least one process.

virtual ~StructuralMeshMovingStrategy()
{
Model& owner_model = mpmesh_model_part->GetModel();
std::string name = mpmesh_model_part->Name();
owner_model.DeleteModelPart(name);
}

But the StructuralMeshMovingStrategy is deleting a model part which was created by it as shown above, hence if no process is used, then this causes a seg fault in the release, and sometimes a proper error in debug (Not using a process is a valid practical case, where I set the BCs for MESH_DISPLACEMENT for the model part from some where else rather than using a Process).

But if we use a single process (a dummy one even), then the python gc destruction order is correct as below.

  1. The Kratos::Strategy
  2. The Kratos::Model

@matekelemen and I was trying to figure out why not having a process cause this change of gc order, but was not succesful. So we are looking for some help on figuring out why this happens.

I am attaching an example with a dummy process (1_correct_gc_collection_order) which doesn't thorw an error. The second example (2_incorrect_gc_collection_order) will seg fault in release, or throw a warning in debug "Attempting to delete non-existent modelpart : test_MeshPart").

It is only with MeshMovingApplication, we tested the same with StructuralMechanicsApplication as well. This same problem is there as well. But it is not noticable, because, StructuralMechanicsApplication strategies does not do anything to the model or model part in the destructors.

Scope

  • Kratos applications or Core (not sure where the error originates from)

To Reproduce
Run the two test cases. 1_correct_gc_collection_order will not give any troubles, but 2_incorrect_gc_collection_order will segfault in release.
gc_example.zip

Expected behavior
2_incorrect_gc_collection_order should not segfault.

Environment

  • OS: Manjaro
  • Branch: master
  • Python 3.11
@philbucher
Copy link
Member

I think this should be refactored anyway (the meshmoving strategies)

How does the ConnnectivityPreserveModeler handle this?
Basically we could use it in the MeshMovingStrat, instead of the custom code for creating the ModelPart.

For the time being I think one can safely remove this delete

@sunethwarna
Copy link
Member Author

sunethwarna commented Mar 19, 2024

@philbucher , The ConnectivityPreserveModeller does not delete any model parts it creates, so those are destrcuted when the Model is destructed. Hence, there won't be a problem.

I think this should be refactored anyway (the meshmoving strategies)

I thought the same, but I am curious why this is happening in the first place :)

@roigcarlo
Copy link
Member

roigcarlo commented Mar 19, 2024

We had a similar problem in MPI (btw can confirm this is happening).

The problem is that we do not have any mechanism implemented in Kratos to enforce the destroy order of objects, so, in the example you send and in the strategy python decides that model has priority.

For this particular case the only clean solution I see, without having to change how Kratos works underneath is to enable the model to be used as a context. Example (taking you code, not 100% sure it works):

def main():
    with Kratos.Model() as model:
        model_part = model.CreateModelPart("test")
        model_part.ProcessInfo[Kratos.DOMAIN_SIZE] = 3
    
        params = Kratos.Parameters("""{
                "problem_data": {
                    "echo_level"   : 0,
                    "parallel_type": "OpenMP",
                    "start_time"   : 0.0,
                    "end_time"     : 1.0
                },
                "solver_settings" : {
                    "domain_size"            : 3,
                    "echo_level"             : 0,
                    "solver_type"            : "structural_similarity",
                    "model_part_name"        : "test",
                    "compute_reactions"      : false,
                    "calculate_mesh_velocity": false,
                    "model_import_settings"              : {
                        "input_type"     : "use_input_model_part"
                    },
                    "time_stepping" : {
                        "time_step"       : 1.1
                    }
                }
        }""")
        mesh_moving_analysis = MeshMovingAnalysis(model, params)
        ReadModelPart("../solid", model_part)
        mesh_moving_analysis.Initialize()
        mesh_moving_analysis.Finalize()
        mesh_moving_analysis.Clear()
    # Going out of context here ensrues that mesh_moving_analysis is deleted before model

    print("Finished everything.")

@sunethwarna
Copy link
Member Author

sunethwarna commented Mar 19, 2024

Hmm.. thats a solution may be we can promote more throughout the Kratos? But this won't work on cases such as unit tests (or any other place), where we create the model somewhere else and read the model part once, and use startegies (That is where this problem orignated for me) :/ At the moment, i forcefully use del to force destruction order which is not nice :/

@matekelemen
Copy link
Contributor

Just here to mention that this behaviour is not unique to MeshMovingStrategy, but all strategies (or at least the ones used by structural mechanics).

But this won't work on cases such as unit tests (or any other place), where we create the model somewhere else and read the model part once, and use startegies

Not particularly nice, but you could just explicitly call Model.__enter__() in the unit test's setUp, and have a matching Model.__exit__() in tearDown. That said, I'd just rather fix the ownership relations of our bindings. I'm not entirely sure how to go about it though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants