Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: Speed up of deepcopy for Miller #461

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

CSSFrancis
Copy link
Member

Description of the change

Speeds up deep copy for Miller class. Current deep copy is slowed down by using the deepcopy function.

This causes issues in diffsims when we have to repeatedly copy and rotate some ReciprocalLatticeVector

For reviewers

  • The PR title is short, concise, and will make sense 1 year later.
  • New functions are imported in corresponding __init__.py.
  • New features, API changes, and deprecations are mentioned in the unreleased
    section in CHANGELOG.rst.
  • Contributor(s) are listed correctly in __credits__ in orix/__init__.py and in
    .zenodo.json.

@hakonanes
Copy link
Member

hakonanes commented Nov 2, 2023

Thank you for looking into making deepcopying faster.

Your solution only deepcopies the vectors and not the phase, which is not in line with the current behavior. The phase should not be shared. If deepcopying the phase is a bottleneck we should look into improving the speed of this method instead.

@hakonanes
Copy link
Member

I realize now that copying only the data is what you want in the case you describe... Are you happy with a parameter deepcopy(data_only=False)? If true, your current solution is used. It is false by default, retaining the current behavior.

@CSSFrancis
Copy link
Member Author

@hakonanes let me look into this more. I think deepcopying only the data is a good solution but I'll look into the phase deepcopy as well.

@hakonanes
Copy link
Member

A sidenote: I've considered replacing deepcopy() with just copy() (and deprecating deepcopy() one minor release). The behavior is the same. But the naming is in line with NumPy's interpretation of a "copy" (not shared memory) and a "view" (shared memory). What do you think?

@harripj
Copy link
Collaborator

harripj commented Nov 29, 2023

@CSSFrancis @hakonanes thanks for looking into this!

I think the deepcopy API should return a deep copy of all relevant data in this case. Whilst this PR may lead to speedups, I think this could be a footgun for the majority of users who would not expect any memory to be shared when using the API.

Are there other examples in other codebases where deepcopy returns partial copies?

If the speed ups are useful in diffsims then I think this PR should be an internal optimisation rather than in orix, such that the copying scope is well-defined and controlled, ie. it is known that phase is shared memory.

@hakonanes
Copy link
Member

Thank you for your input, @harripj. I agree with you on all points. The deepcopy method is meant to be used by end users or in the setup of more demanding methods, not repeatedly inside a loop or similar.

But if we can get any speed up that would be nice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants