
Add grad chunk size #1590

Draft: chrisrothUT wants to merge 4 commits into master
Conversation

chrisrothUT
Collaborator

This PR adds an argument `grad_chunk_size` so that different chunk sizes can be used for the forward and backward passes.

Generally speaking, for a multi-layer perceptron with H hidden nodes and L layers, the forward pass takes on the order of H memory per sample while the backward pass takes on the order of H^2 + LH. So there are many situations where it's efficient to use a larger chunk size for the forward pass.
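As an illustration of the intended usage (a minimal sketch: everything except the new `grad_chunk_size` attribute is existing NetKet API, and the model, sizes, and numbers are made up):

```python
import netket as nk

hi = nk.hilbert.Spin(s=1 / 2, N=16)
sampler = nk.sampler.MetropolisLocal(hi)
model = nk.models.RBM(alpha=4)

vs = nk.vqs.MCState(sampler, model, n_samples=4096)

# Forward pass (local energies, expectation values) needs ~H memory per sample,
# so a large chunk size is affordable.
vs.chunk_size = 4096

# Backward pass needs ~H^2 + L*H memory per sample, so chunk it more finely.
# The grad_chunk_size attribute is the addition proposed in this PR.
vs.grad_chunk_size = 512
```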

Comment on lines 652 to 655

      if mutable is None:
          mutable = self.mutable

    - return expect_and_forces(self, Ô, self.chunk_size, mutable=mutable)
    + return expect_and_forces(self, Ô, self.chunk_size, self.grad_chunk_size, mutable=mutable)
Member

Doesn't it make more sense to just do

    chunk_size = self.grad_chunk_size
    if chunk_size is None:
        chunk_size = self.chunk_size
    return expect_and_forces(self, Ô, chunk_size, mutable=mutable)

?

That way you don't need to touch everything else, because in any case there is just one chunk size being used in those underlying functions.

Collaborator Author

The function `forces_expect_hermitian_chunked` uses both arguments: `chunk_size` to get the local energies, and `grad_chunk_size` to get the forces.
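To illustrate the split, here is a standalone JAX sketch with a toy model (the helper names are hypothetical; this is not NetKet's `forces_expect_hermitian_chunked`): the local energies only need forward evaluations and can use coarse chunks, while the forces go through a VJP and are chunked with the smaller `grad_chunk_size`.

```python
import jax
import jax.numpy as jnp

def log_psi(params, x):
    # Toy "wavefunction": one dense layer followed by a sum.
    return jnp.sum(jnp.tanh(x @ params))

def in_chunks(fn, xs, chunk_size):
    # Apply fn to xs in chunks along the leading axis and concatenate.
    outs = [fn(xs[i : i + chunk_size]) for i in range(0, xs.shape[0], chunk_size)]
    return jnp.concatenate(outs)

params = jnp.ones((8, 4))
samples = jnp.ones((1024, 8))
e_loc = jnp.ones((1024,))  # stand-in for the local energies

batched_log_psi = jax.vmap(log_psi, in_axes=(None, 0))

# Forward pass: a large chunk_size is cheap in memory.
values = in_chunks(lambda s: batched_log_psi(params, s), samples, chunk_size=512)

# Backward pass (forces): run the VJP over much smaller grad_chunk_size chunks.
def chunk_force(s, e):
    _, vjp = jax.vjp(lambda p: batched_log_psi(p, s), params)
    return vjp(e)[0]

grad_chunk_size = 64
forces = sum(
    chunk_force(samples[i : i + grad_chunk_size], e_loc[i : i + grad_chunk_size])
    for i in range(0, samples.shape[0], grad_chunk_size)
) / samples.shape[0]
```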

Member

ah, right. ok then

@PhilipVinc
Member

I think it's a good addition. I would just like some tests to be added, checking at least the setting/unsetting and the gradient.

PhilipVinc added a commit that referenced this pull request Oct 25, 2023
Our funny @alleSini99 recently contributed a set of Renyi entropy
estimators, which are defined to inherit from `AbstractOperator`, so we
need to define some methods like `is_hermitian` that do not make much
sense for such an object.

Moreover, to define the gradient, the dispatch rule for this observable
has this ugly-as-hell `TrueT` or `Literal[True]` that nobody besides me
understands.

This PR is an attempt to
 - Simplify the creation of a new generic operator/observable
 - Simplify the definition of signatures for dispatch of expect/grad by:
   - removing the `use_covariance` argument from the general interface
   - only keeping `use_covariance` for the expectation value of operators where it makes sense, and it will not be part of the dispatch signature

In practice...

- This introduces a new supertype of AbstractOperator which I
call `AbstractObservable`. The difference between AbstractOperator and
AbstractObservable is that an Observable is very general and requires
nothing besides a Hilbert space: no `is_hermitian` or `dtype` arguments. So
it should cover the most general case.

- Renyi entropy estimator is transitioned to this interface.

- The signature that users must define for expectation value estimators
will now be
```python
@dispatch
def expect(vs: MCState, ob: Reny2Entropy, chunk_size: Optional[int]):
  pass
```
and for gradients will be (the much simpler)
```python
@dispatch
def expect_and_grad(vs: MCState, ob: Reny2Entropy, chunk_size: Optional[int]):
  pass
```

Incidentally, this will make it simple to implement different types of
chunking like @chrisrothUT wants to do in #1590 by dispatching on a
tuple of integers for the chunk size instead of just an integer. Right
now the dispatch logic is very messy and this would not be easy to do.
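A minimal sketch of what dispatching on a tuple of chunk sizes could look like (purely hypothetical, reusing the names from the signatures above; not part of this commit):

```python
@dispatch
def expect_and_grad(vs: MCState, ob: Reny2Entropy, chunk_size: tuple[int, int]):
    # Hypothetical: unpack separate chunk sizes for the forward and backward passes.
    forward_chunk_size, grad_chunk_size = chunk_size
    ...
```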

Note that users are required to specify the chunk size, and if they don't
support it they have to explicitly state `chunk_size: None`. I could
relax this requirement, but it makes user-defined code future-proof in
case we add more arguments.

The main problem with these changes is that they break user-defined
operators using the old syntax. This is not strictly a problem, because
this part of the interface is documented to be unstable, though it's
annoying.
I could add some inspect magic to detect usages of the old signatures
and auto-convert them to the new format and warn. To be experimented
with.

codecov bot commented Oct 29, 2023

Codecov Report

Attention: Patch coverage is 52.77778%, with 17 lines in your changes missing coverage. Please review.

Project coverage is 82.60%. Comparing base (c841005) to head (7454e7e).
Report is 1 commit behind head on master.

| Files | Patch % | Lines |
| --- | --- | --- |
| netket/vqs/mc/mc_state/expect_grad_chunked.py | 41.66% | 3 Missing and 4 partials ⚠️ |
| netket/vqs/mc/mc_state/state.py | 50.00% | 7 Missing ⚠️ |
| netket/vqs/mc/mc_state/expect_forces_chunked.py | 70.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1590      +/-   ##
==========================================
- Coverage   82.73%   82.60%   -0.13%     
==========================================
  Files         299      299              
  Lines       18271    18299      +28     
  Branches     3480     3491      +11     
==========================================
  Hits        15116    15116              
- Misses       2476     2504      +28     
  Partials      679      679              

