
Unable to update passive swarm variables #166

Open

bknight1 opened this issue Mar 5, 2024 · 12 comments

@bknight1 (Member) commented Mar 5, 2024

UW is producing an index error when running on multiple processors:

Traceback (most recent call last):
  File "/Users/benknight/Documents/Research/Modelling/UW-models/UW3-dev/passive_swarm_issue.py", line 64, in <module>
    test_ps.data[:,0] = 1.
  File "/Users/benknight/Documents/Research/GitHub/underworld3/src/underworld3/swarm.py", line 1513, in __exit__
    var._update()
  File "/Users/benknight/Documents/Research/GitHub/underworld3/src/underworld3/swarm.py", line 278, in _update
    self._rbf_to_meshVar(self._meshVar)
  File "/Users/benknight/Documents/Research/GitHub/underworld3/src/underworld3/swarm.py", line 302, in _rbf_to_meshVar
    Values = self.rbf_interpolate(new_coords, verbose=verbose, nnn=nnn)
  File "/Users/benknight/Documents/Research/GitHub/underworld3/src/underworld3/swarm.py", line 386, in rbf_interpolate
    values = kdt.rbf_interpolator_local(new_coords, D, nnn, verbose)
  File "src/underworld3/kdtree.pyx", line 231, in underworld3.kdtree.KDTree.rbf_interpolator_local
IndexError: index 4607182418800017421 is out of bounds for axis 0 with size 2
[2]PETSC ERROR: ------------------------------------------------------------------------
[2]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly illegal memory access
[2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[2]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/
[2]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[2]PETSC ERROR: to get more information on the crash.
[2]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash.

I've attached a basic script to replicate the issue.

Cheers
passive_swarm_issue.txt

@lmoresi (Member) commented Mar 5, 2024

I checked in a sort-of fix (setting the nodal-point value to zero when no particles are found). I'm assuming no-particles-found is the error condition, but I haven't checked whether it is actually a small number of particles that is required to build the kd-tree consistently.

We did not have a check for this case: a sparse swarm that does not have particles on every process but still tries to build a proxy mesh variable. In general, this is not really a well-defined situation. We should probably have a value that can be set as the default where no particles are nearby. Alternatively, refuse to build a proxy.

I'll leave the issue open for discussion of a proper fix.
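
A minimal sketch of the guard being discussed, in plain numpy/scipy rather than the underworld3 internals (the function name `rbf_to_mesh`, the `nnn` and `default` arguments, and the inverse-distance weighting are all illustrative assumptions):

```python
import numpy as np
from scipy.spatial import cKDTree

def rbf_to_mesh(swarm_coords, swarm_values, mesh_coords, nnn=4, default=0.0):
    """Interpolate a scalar swarm field onto mesh nodes (sketch only).

    Guards against the failure mode in the traceback above: a rank with
    no particles (or fewer than nnn) cannot supply nnn neighbours, so
    fall back to `default` or clamp the neighbour count.
    """
    n_local = swarm_coords.shape[0]
    if n_local == 0:
        # No particles on this rank: nothing to interpolate from.
        return np.full(mesh_coords.shape[0], default)

    nnn = min(nnn, n_local)              # never request more neighbours than exist
    dist, idx = cKDTree(swarm_coords).query(mesh_coords, k=nnn)
    dist = np.atleast_2d(dist.T).T       # keep (n_nodes, nnn) shape even when nnn == 1
    idx = np.atleast_2d(idx.T).T

    w = 1.0 / np.maximum(dist, 1.0e-12)  # inverse-distance weights
    return (w * swarm_values[idx]).sum(axis=1) / w.sum(axis=1)
```

The alternative mentioned above (refusing to build a proxy at all) would amount to raising an error in the `n_local == 0` branch instead of filling with `default`.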

@gthyagi (Contributor) commented Mar 7, 2024

I was doing something similar to this in #142. I am closing #142; we can track this one instead.

@bknight1 (Member, Author) commented Mar 7, 2024

@gthyagi It's a slightly different issue, as the fixes Louis added won't stop the error you posted from occurring.

@gthyagi (Contributor) commented Mar 8, 2024

@bknight1, I agree.
I referred to the other issue here because these issues fall under a broader category: the scenario where a processor contains zero particles. The existing built-in particle functions consistently generate errors when executed in parallel under this circumstance.

For users transitioning from uw2, tasks involving particles may appear straightforward. However, in uw3, all such operations become nontrivial.
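
To illustrate that zero-particles-on-a-rank category in isolation (generic mpi4py/numpy, not underworld3 code; the helper name `field_min` is hypothetical): a purely local reduction over particle data fails on a rank that owns no particles, so a parallel-safe version needs a local guard plus a collective.

```python
import numpy as np
from mpi4py import MPI

def field_min(local_values, comm=MPI.COMM_WORLD):
    """Global minimum of a per-particle field when some ranks may hold
    zero particles (hypothetical helper, not underworld3 API)."""
    # local_values.min() raises on an empty array, so substitute a
    # sentinel that cannot win the MIN reduction.
    local_min = local_values.min() if local_values.size else np.inf
    return comm.allreduce(local_min, op=MPI.MIN)
```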

@lmoresi (Member) commented Mar 8, 2024

I'm not sure what we can do about this. underworld2 would return the indices of particles added to the swarm, so you could tell that a migration had taken place, but you could not always be sure you would still have particles on any given process.

We should think about how this should be done to minimise the confusion in parallel.

I agree that this becomes complicated quite quickly, and it always seems to be difficult, but I thought this was the case in uw2 as well.

L

@bknight1 (Member, Author) commented Mar 8, 2024

> @bknight1, I agree. I referred to the other issue here because these issues fall under a broader category: the scenario where a processor contains zero particles. The existing built-in particle functions consistently generate errors when executed in parallel under this circumstance.
>
> For users transitioning from uw2, tasks involving particles may appear straightforward. However, in uw3, all such operations become nontrivial.

I'm not sure I agree; what you were trying to do would still produce an error in UW2 too.

@knepley (Collaborator) commented Mar 8, 2024

@lmoresi We can number the particles if that is something Underworld needs. I have had at least one other request. It is the same as adding the "cellid" field, but it would be "particleid".

@bknight1 (Member, Author) commented

@knepley I think particle numbering would be useful for us

@knepley (Collaborator) commented Mar 18, 2024

It is already doable. Add an int field and set it to an id when you create the particles.
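
A sketch of that recipe with petsc4py's DMSwarm (the field name "particleid", the standalone swarm set-up, and the rank-offset id scheme are illustrative assumptions, not an agreed Underworld interface):

```python
import numpy as np
from mpi4py import MPI
from petsc4py import PETSc

comm = MPI.COMM_WORLD

# Stand-alone swarm purely to show the field registration; in practice this
# would be the DMSwarm that underworld3 already builds.
swarm = PETSc.DMSwarm().create(comm=comm)
swarm.setDimension(2)
swarm.initializeFieldRegister()
swarm.registerField("particleid", 1, dtype=PETSc.IntType)
swarm.finalizeFieldRegister()

nlocal = 100                      # however many particles this rank creates
swarm.setLocalSizes(nlocal, 4)

# Globally unique ids: offset each rank by the particle counts before it.
offset = comm.exscan(nlocal) or 0
pid = swarm.getField("particleid")
pid[...] = np.arange(offset, offset + nlocal, dtype=pid.dtype).reshape(pid.shape)
swarm.restoreField("particleid")

# The ids travel with the particles through any subsequent migration, so a
# particle can be recognised on whichever rank it ends up on.
```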

@lmoresi (Member) commented Mar 18, 2024 via email

@knepley (Collaborator) commented Mar 18, 2024 via email

@lmoresi (Member) commented Apr 3, 2024

Agreed - we should do this ourselves while we wait for this to propagate into PETSc itself. @julesghub - feel free to assign this to somebody other than me.
