Add grad chunk size #1590
Conversation
netket/vqs/mc/mc_state/state.py (Outdated)

```diff
 if mutable is None:
     mutable = self.mutable

-return expect_and_forces(self, Ô, self.chunk_size, mutable=mutable)
+return expect_and_forces(self, Ô, self.chunk_size, self.grad_chunk_size, mutable=mutable)
```
Doesn't it make more sense to just do

```python
chunk_size = self.grad_chunk_size
if chunk_size is None:
    chunk_size = self.chunk_size
return expect_and_forces(self, Ô, chunk_size, mutable=mutable)
```

?
That way you don't need to touch everything else, because in any case there is just one chunk size being used in those underlying functions.
The function `forces_expect_hermitian_chunked` uses both arguments: `chunk_size` to get the local energies, and `grad_chunk_size` to get the forces.
ah, right. ok then
I think it's a good addition. I just would like some tests to be added, checking at least the setting/unsetting and the gradient.
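For reference, a minimal sketch of the kind of test being requested, assuming hypothetical pytest fixtures `vstate` (an `MCState`) and `hamiltonian`; these names are placeholders, not NetKet's actual test suite:

```python
import jax
import numpy as np

def test_grad_chunk_size_set_unset(vstate):
    # setting and unsetting the new attribute
    vstate.grad_chunk_size = 64
    assert vstate.grad_chunk_size == 64
    vstate.grad_chunk_size = None
    assert vstate.grad_chunk_size is None

def test_gradient_independent_of_chunking(vstate, hamiltonian):
    # samples are cached between calls, so the gradient must not
    # depend on how the computation is chunked
    _, grad_ref = vstate.expect_and_grad(hamiltonian)
    vstate.chunk_size = 32
    vstate.grad_chunk_size = 16
    _, grad_chunked = vstate.expect_and_grad(hamiltonian)
    jax.tree_util.tree_map(
        lambda a, b: np.testing.assert_allclose(a, b, rtol=1e-6),
        grad_ref,
        grad_chunked,
    )
```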
Force-pushed from 562101f to 5d8bad1.
Our funny @alleSini99 recently contributed a set of Renyi entropy estimators, which are defined to inherit from `AbstractOperator`, so we need to define some methods like `is_hermitian` that do not make much sense for such an object. Moreover, to define the gradient, the dispatch rule for this observable has this ugly-as-hell `TrueT` or `Literal[True]` that nobody besides me understands.

This PR is an attempt to:
- Simplify the creation of a new generic operator/observable
- Simplify the definition of signatures for dispatch of expect/grad by:
  - removing the `use_covariance` argument from the general interface
  - keeping `use_covariance` only for the expectation value of operators where it makes sense, where it will not be part of the dispatch signature

In practice:
- This introduces a new supertype of `AbstractOperator`, which I call `AbstractObservable`. The difference between `AbstractOperator` and `AbstractObservable` is that an observable is very general and requires nothing besides a Hilbert space: no `is_hermitian` or `dtype` arguments. So it should cover the most general case.
- The Renyi entropy estimator is transitioned to this interface.
- The signature that users must define for expectation value estimators will now be

```python
@dispatch
def expect(vs: MCState, ob: Reny2Entropy, chunk_size: Optional[int]):
    pass
```

and for gradients will be (the much simpler)

```python
@dispatch
def expect_and_grad(vs: MCState, ob: Reny2Entropy, chunk_size: Optional[int]):
    pass
```

Incidentally, this will make it simple to implement different types of chunking, like @chrisrothUT wants to do in #1590, by dispatching on a tuple of integers for the chunk size instead of just an integer. Right now the dispatch logic is very messy and this would not be easy to do.

Note that users are required to specify the chunk size, and if they don't support it they have to explicitly declare `chunk_size: None`. I could relax this requirement, but it makes user-defined code future-proof in case we add more arguments.

The main problem with those changes is that they break user-defined operators using the past syntax. This is not strictly a problem, because this part of the interface is documented to be unstable, though it's annoying. I could add some inspect magic to detect usages of the old signatures, auto-convert them to the new format, and warn. To be experimented with.
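As an illustration of that last point, a hypothetical sketch of what tuple-based dispatch could look like under the new interface. It uses plum directly (NetKet's dispatch utilities wrap it), and the stand-in classes exist only to keep the example self-contained:

```python
from typing import Optional, Tuple
from plum import dispatch

# Stand-ins so the sketch runs on its own; in NetKet these would be the
# real nk.vqs.MCState and an observable built on the new interface.
class MCState: ...
class Reny2Entropy: ...

@dispatch
def expect_and_grad(vs: MCState, ob: Reny2Entropy, chunk_size: Optional[int]):
    ...  # one chunk size shared by the forward and backward pass

@dispatch
def expect_and_grad(vs: MCState, ob: Reny2Entropy, chunk_size: Tuple[int, int]):
    fwd_chunk, grad_chunk = chunk_size  # per-pass chunking, as in #1590
    ...
```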
Force-pushed from b13b4b1 to e4ee419.
Codecov Report — Attention: the 28 added lines are not covered by tests (Hits unchanged, Misses +28).

Additional details and impacted files:

```
@@           Coverage Diff            @@
##           master    #1590    +/-   ##
=========================================
- Coverage   82.73%   82.60%   -0.13%
=========================================
  Files         299      299
  Lines       18271    18299      +28
  Branches     3480     3491      +11
=========================================
  Hits        15116    15116
- Misses       2476     2504      +28
  Partials      679      679
```
Force-pushed from e4ee419 to 7454e7e.
This PR adds an argument `grad_chunk_size` so that different chunk sizes can be used for the forward and backward passes.

Generally speaking, for a multi-layer perceptron with H hidden nodes per layer and L layers, the forward pass takes H memory per sample while the backward pass takes H^2 + LH, since it must also store the weight gradients and the activations of every layer. So there are many situations where it is efficient to use a larger chunk size for the forward pass.
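A hypothetical usage sketch, assuming the `grad_chunk_size` attribute this PR adds next to the existing `chunk_size` (the Hilbert space, sampler, and model are arbitrary):

```python
import netket as nk

hi = nk.hilbert.Spin(s=1 / 2, N=16)
sampler = nk.sampler.MetropolisLocal(hi)
model = nk.models.RBM(alpha=2)
vs = nk.vqs.MCState(sampler, model, n_samples=4096)

vs.chunk_size = 4096      # forward pass: cheap per sample, so large chunks
vs.grad_chunk_size = 512  # backward pass: stores activations, so smaller chunks
```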