Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring the Orientation Mapping Code #944

Open
6 tasks
CSSFrancis opened this issue Jul 24, 2023 · 5 comments
Open
6 tasks

Refactoring the Orientation Mapping Code #944

CSSFrancis opened this issue Jul 24, 2023 · 5 comments

Comments

@CSSFrancis
Copy link
Member

CSSFrancis commented Jul 24, 2023

@din14970 I know that you had started some refactoring of the orientation mapping code but I would like to maybe pick some of that up.

Changes necessary for refactoring Orientation Mapping Code:

  • Speed up (and simplify) simulation generation
  • Add fast azimuthal_integration method using cupy Diffraction2D --> Polar2D
  • Use map function for orientation mapping
  • Add to cupy function
@din14970
Copy link
Contributor

Probably a good start! Though I think there's quite a bit more to do than that to make it like I envisioned it.

Clean up the doc strings (I think you have done quite a bit of that already)

Agreed though I think a change of API (which I think is necessary) would make this redundant.

Have the user set the dask backend outside of the function (This should add support for Dask-distributed and multi-gpu support)

Yes, I agree with this.

Make a generator class which handles both the signal and the simulations

Not sure what you mean by this. Just a container for storing them together?

What I personally had in mind and started doing before when I could find free time:

  • Refactor diffsims template library generation. Generating a template library is quite a chore and the implementation is single threaded, python-loopy and exceedingly slow (often the bottleneck for template matching). In order to create a library, one must do a bunch of weird imports from at least 3 different packages. The entire library can be simplified, optimized and yet generalized (e.g. for the case when a detector is not perpendicular to the beam.). My start was on my fork.
  • In Pyxem the "template matching function" should be decomposed into the essential and the rest. Now the function tries to do too much at the same time, while simultaneously being not flexible. How to do this concretely in the best way I can only figure out while I'm doing it.
  • Ensure that the new API can be "translated" to how the old function used to work, but deprecate the old function.

@CSSFrancis
Copy link
Member Author

Agreed though I think a change of API (which I think is necessary) would make this redundant.

Maybe this is something to think about

Make a generator class which handles both the signal and the simulations

Not sure what you mean by this. Just a container for storing them together?

Pretty much. That was the idea of the Generator class originally is that it would hold multiple objects. I have been trying to get away from them though and am more in favor of everything being a function of the Diffraction Signal class.

What I personally had in mind and started doing before when I could find free time:

  • Refactor diffsims template library generation. Generating a template library is quite a chore and the implementation is single threaded, python-loopy and exceedingly slow (often the bottleneck for template matching). In order to create a library, one must do a bunch of weird imports from at least 3 different packages. The entire library can be simplified, optimized and yet generalized (e.g. for the case when a detector is not perpendicular to the beam.). My start was on my fork.

Hmm yea probably. There is lots of good stuff there. We could use dask to make it run in parallel and it wouldn't be terribly hard to get it to work. The nice thing is that this doesn't really break the pyxem api, only the diff sims api so this part won't hold up a 1.0.0 release.

  • In Pyxem the "template matching function" should be decomposed into the essential and the rest. Now the function tries to do too much at the same time, while simultaneously being not flexible. How to do this concretely in the best way I can only figure out while I'm doing it.

Yea so what I envision is a that the entire process would be lazy from end to end. hyperspy/hyperspy#3148 Allows us to make markers which are lazy and so potentially you could fit the data and then just look at a couple of diffraction patterns to see how it does without computing more than a chunk or two. The trick is to make it a process where you only have to compute the full orientation map once or twice.

s # Lazy 4D STEM Data
template_lib # template library

match = s.orientation_map(template_library = template_lib, ...) 
match # lazy orientation map
match_markers = match.as_markers() # lazy markers 
s.plot() # plot lazily
s.add_markers(match_markers)  # add the lazy orientation map

That way you could continually run this chunk of code and adjust parameters while visualizing the data but you would only run the orientation mapping on a couple of chunks each time. Then when you are happy with the parameters you would just compute the whole thing.

The other thing would be something like a

Just looking at this function

def index_dataset_with_template_rotation(
    signal,
    library,
    phases=None,
    n_best=1,
    frac_keep=1.0,
    n_keep=None,
    delta_r=1.0,
    delta_theta=1.0,
    max_r=None, 
    intensity_transform_function=None, # this should be removed and the function called before
    normalize_images=False,
    normalize_templates=True,
    chunks="auto",  # "auto" rechecking by dask is particularly terrible so I would recommend against it
    parallel_workers="auto", # set outside using dask 
    target="cpu",
    scheduler="threads",  # set outside using dask 
    precision=np.float64, # 32 bit float is probably enough

It seems like things could be fair bit simpler as well. I also think that moving the function to using the hyperspy.map function would be quite a bit better long term. We could have to add a key in the map function like gpu=True which basically does the same thing here and transfers the data to gpu and then back. It's been on my list for a bit but I haven't gotten around to it.

  • Ensure that the new API can be "translated" to how the old function used to work, but deprecate the old function.

Yea we should do that :)

I think from my end I am going to add in a little better gpu support for the map function and once that is done we can think about how things will look then.

@hakonanes
Copy link
Member

hakonanes commented Jul 28, 2023

I have been trying to get away from them [generators] though and am more in favor of everything being a function of the Diffraction Signal class.

Also, one could think about new classes for the objects (arrays and other information) contained in the current classes.

For simulating geometrical (or kinematical) EBSD patterns with kikuchipy, we use a KikuchiPatternSimulator which stores an orix.crystal_map.Phase and a list of diffsims.crystallography.ReciprocalLatticeVector. The simulator has two main methods: (1) return a GeometricalKikuchiPatternSimulation given an EBSDDetector and one or more Rotation(s) and (2) calculate a kinematical EBSDMasterPattern in the stereographic projection. This structure has served us well for some years, now, and should be easy to extend with a method to the simulator if we want to say make it perform two-beam or dynamical simulations.

The nice thing is that this doesn't really break the pyxem api, only the diff sims api so this part won't hold up a 1.0.0 release.

Changes in diffsims doesn't affect pyxem only if the things returned by diffsims are the same. If what diffsims’ returns changes, this would necessitate changes in pyxem's API as well, right?

@CSSFrancis
Copy link
Member Author

So I've been thinking about how to best do this.

Because we already have a PolarDiffraction2D class I think it makes the most sense to:

1a. Convert a Diffraction2D signal to a PolarDiffraction2D signal. This accounts for Ewald sphere effects as well as can account for position shifts etc.
1b. Create a template library to match against. This occurs in diff sims using a CIF file and probably needs to be simplified a little bit.

  1. Call PolarDiffraction2D.template_match to match the template library with the library.
  2. Return an orix.crystalMap object

The action items for this would be:

  • Add template_match function to PolarDiffraction2D
  • Make it so template_match doesn't force the dask_backend
  • Clean up the TemplateLibrary creation in diffsims
  • Make template_match return a CrystalMap
  • Make a new example notebook.

@CSSFrancis
Copy link
Member Author

CSSFrancis commented Jan 29, 2024

Here is a little bit more detailed of a road map:

Ideally each of these points should be one PR if I do this properly...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants