Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gp/continuous geometry #1583

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

Conversation

gpescia
Copy link
Collaborator

@gpescia gpescia commented Sep 20, 2023

First draft to implement the geometry for continuous space.
There is continous_geometry.py as base class for specialized ones: Cell, Free.
At the moment only these two are available (so no mix of boundary conditions).
I commented out all tests in hilbert that test for mixed boundary conditions as well as the Langevin sampler.
Examples run fine on my machine.

gpescia added 3 commits September 19, 2023 16:55
…nly two classes are implemented: Cell = fully periodic, Free = no pbc at all. Commented out tests with mixed boundary conditions. Commented out tests of Langevin sampler
@PhilipVinc
Copy link
Member

  • We should decide where to put the geometry stuff. I'm not really in favour of .graph let's see what @gcarleo thinks about
  • We should not break old interfaces. So ContinuousHilbert should still accept the old arguments, and also geometry. If geometry is none it should use the other arguments to build a geomtry
  • This way of implementing the random states is not ok

Before actually implementing this stuff, I think someone should lay out clearly what we want to do with it. For example, do we need the inverse lattice? Why? What is the expected use case of this?

For example, the messy translation code in the sampling makes me suspect that the abstract class should have a method translate_positions_by_vector or something like that...

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 20, 2023

We should not break old interfaces. So ContinuousHilbert should still accept the old arguments, and also geometry. If geometry is none it should use the other arguments to build a geomtry

yes true, I‘ll add the other as well.

This way of implementing the random states is not ok

In the end I guess we want to dispatch on cell/free.

For example, do we need the inverse lattice? Why? What is the expected use case of this?

yes to go from lattice basis to standard basis and back.

@PhilipVinc
Copy link
Member

yes to go from lattice basis to standard basis and back.

In what part of the code is this needed? If this is not used in the code, can you give an example of what it can be used for and how it would be used?

@gcarleo
Copy link
Member

gcarleo commented Sep 20, 2023

Yes for the position of the module, it's a secondary problem, but I also wouldn't put it in Graph. Anyways, there is time to think about that...

In my view the Geometry object should have a minimal interface to start with allowing to:

  1. Compute distances between any two points R1,R2 that are valid for the given Geometry. This method is just distance and should know nothing about periodicity etc explicitly (i.e. you don't pass pbc=True etc to distance) it is the essence of the Geometry to know how the distance should be computed (e.g. with minimum image etc)
  2. Generate a valid random point R
  3. Add/subtract vectors in this space (say implement R_1 + R_2)? (not sure about this point but maybe it's handy for sampling)

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 20, 2023

In the sampler and the hilbert space random state. We sample fractional coordinates, for this we need to go from standard to lattice basis back to standard

@gcarleo
Copy link
Member

gcarleo commented Sep 20, 2023

In the sampler and the hilbert space random state. We sample fractional coordinates, for this we need to go from standard to lattice basis back to standard

but that's why the random points should be generated in the Geometry, not in Hilbert, see point 2 of the list I made above. I think this allows to simplify most of the code used for sampling

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 20, 2023

  1. Well you need to tell the function whether you want periodic in the cell or minimum image convention. For this you need the ‚mode’ I would say. In the free case you don‘t need it.

  2. The only difference is the placement of the random state right? One still goes nack and forth between the two bases.

@gcarleo
Copy link
Member

gcarleo commented Sep 20, 2023

  1. Yes but you specify periodic or minimum image when you construct Cell, not when you call Distance, in this way you completely separate the sampling and Hilbert from these details of the geometry

  2. It's important to disentangle everything that relates to Geometry from Hilbert and the Sampling, so that you have a general code for the last two that can work for any Geometry

@PhilipVinc
Copy link
Member

Well you need to tell the function whether you want periodic in the cell or minimum image convention. For this you need the ‚mode’ I would say. In the free case you don‘t need it.

Are there cases where for the same geometry one wants one of those two different distances? Can you give an example?
Can we think a case of a default distance to be used mostly everywhere? In that case Geometry objects could expose different distance functions, and a default one to be used for interaction potentials or I don't know what else.

@gcarleo
Copy link
Member

gcarleo commented Sep 20, 2023

I think there is no reasonable case in which you want to use two different distances within the same simulation Cell, that's why I think the Cell type should be used to identify uniquely the type of distance used as well

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 20, 2023

Well potential energies are computed always with minimum image, and distances for the wf never. So you need both…

@PhilipVinc
Copy link
Member

What do you mean by distances for the wf ?

@codecov
Copy link

codecov bot commented Sep 20, 2023

Codecov Report

Attention: 60 lines in your changes are missing coverage. Please review.

Comparison is base (ee98fef) 83.53% compared to head (a88a945) 84.22%.
Report is 7 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1583      +/-   ##
==========================================
+ Coverage   83.53%   84.22%   +0.69%     
==========================================
  Files         257      261       +4     
  Lines       14519    15155     +636     
  Branches     2203     2232      +29     
==========================================
+ Hits        12129    12765     +636     
- Misses       1841     1852      +11     
+ Partials      549      538      -11     
Files Coverage Δ
netket/experimental/__init__.py 100.00% <100.00%> (ø)
netket/experimental/geometry/__init__.py 100.00% <100.00%> (ø)
netket/hilbert/continuous_hilbert.py 100.00% <100.00%> (+22.22%) ⬆️
netket/hilbert/random/particle.py 84.21% <100.00%> (-15.79%) ⬇️
netket/models/deepset.py 94.44% <100.00%> (+1.11%) ⬆️
netket/operator/_kinetic.py 94.82% <100.00%> (ø)
netket/sampler/rules/continuous_gaussian.py 89.47% <100.00%> (-1.44%) ⬇️
...etket/experimental/geometry/continuous_geometry.py 83.33% <83.33%> (ø)
netket/hilbert/particle.py 87.03% <78.78%> (-10.47%) ⬇️
netket/experimental/geometry/Free.py 53.65% <53.65%> (ø)
... and 1 more

... and 76 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 20, 2023

It's important to disentangle everything that relates to Geometry from Hilbert and the Sampling, so that you have a general code for the last two that can work for any Geometry

But random state is actually a state in hilbert not in the geometry, right?

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 20, 2023

distance features used in the wavefunction

@gcarleo
Copy link
Member

gcarleo commented Sep 20, 2023

sorry Gabriel but is it truly necessary to use two different distances? I would say one should be enough / is much nicer than using two different distances for the same space

@gcarleo
Copy link
Member

gcarleo commented Sep 20, 2023

in any case, maybe it's more efficient to talk on zoom tomorrow or so

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 20, 2023

Sure we can have a meeting! I‘m just saying that there are two distance measures which both get used. I‘m not sure how to optimally resolve that..

@PhilipVinc
Copy link
Member

I would love to understand why you are using more than one distance.
If I understand it's because in the nn/variational ansatz you want to use the standard distance instead of the minimum image.
But physically this, I think, is wrong, because the minimum image is the physical one in a periodic system.
So what is the underlying reason of this? Performance? It works? Or maybe I fail to understand?

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 20, 2023

The reason is smootheness. The minimum image convention reproduces Euclidean distance as you say but it is not smooth and cannot be used as input to the wavefunction as we take derivatives w.r.t. position. The other distance (sin-distance) behaves similar as the minimum image distance but is smooth. So for potential energies (no derivatives involved) we want the "correct" euclidean distance. For the wavefunction we take the less correct but smooth version.

@PhilipVinc
Copy link
Member

Thanks! This was the explanation I was looking for.
I think that to maintain flexibility we could should just implement both as two different functions. Then in the models, one uses what he wants.
But to be further discussed..

@gcarleo
Copy link
Member

gcarleo commented Sep 20, 2023

OK, not ideal, but then this is solved maybe specifying what type of distance you want, it's true that even in euclidean space you might want L1,L2... etc

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 20, 2023

sure we can do that too. But that is basically the same as setting a 'mode' flag or do you gain performance with this?

@gcarleo
Copy link
Member

gcarleo commented Sep 20, 2023

Yes I think it's just a flag to be passed: geo.Distance(R1,R2,type='L2') etc

@PhilipVinc
Copy link
Member

It's not about performance. It's about having a clean and sane API.
I think geometry.distance should always give the physical distance, aka minimum image.
In some cases you might want to cheat / use non physical distances in which case those should have different names.
mode might hide them a bit.
But I'm not completely convinced.
Maybe mode is good.
Would like to hear opinions.

@gcarleo
Copy link
Member

gcarleo commented Sep 20, 2023

if you want the main problem is where the type flag is to be used. I think for the ansatz there is no problem since you will have an ansatz specific to periodic systems that will also know about different types of distance that a Cell can accept. Now we have to understand how we make the sampler agnostic somehow of the type of Geometry

@PhilipVinc the physical distance in a sense is not well defined, it might be that it is L1 that makes sense and not L2, depending on the problem (just to give an example). In the same spirit for the periodic system one type might make more sense than the other type of distance...

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 20, 2023

Ah seems it falls back on the system version of black rather than the one of the environment when running it as a script.
python -m black {source_file_or_directory} works.

…ostic to hilbert. Distance functions for free and periodic space. Hash does not yet work. Langevin not yet working.
@gpescia
Copy link
Collaborator Author

gpescia commented Sep 21, 2023

Does it destroy back-compatibility if we error out when mixed boundary conditions are used? I have never seen boundary conditions in one direction but not in the other ones in the context of continuous space. The only place this gets used is netket tests. I would be very much in favour for removing this.

@gcarleo
Copy link
Member

gcarleo commented Sep 21, 2023

I am not sure it's necessary to keep back compatible interfaces though, this is all in netket.experimental, part of the deal is that we can break these interfaces without major releases

@PhilipVinc
Copy link
Member

This is not in netket.experimental. We could break a bit, but for a good reason (and I'm not sure that creating a geometry is necessary all the time)

@gpescia
Copy link
Collaborator Author

gpescia commented Sep 21, 2023

I just think that supporting mixed boundary conditions makes it more complicated to take care of all cases while not providing any benefit in terms of usability. If you really want mixed boundary conditions just take pbc everywhere and make one dimension very large...
But that is up to discussion I guess.

Edit: Also I have no idea what would be the correct distance in this case and how to correctly do the periodic decomposition.

@PhilipVinc
Copy link
Member

I'm not against dropping support for mixed boundary conditions, though due to MPS it's a not-so-uncommon scenario.
It could raise a not implemented error for now.

Copy link
Member

@PhilipVinc PhilipVinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest that we agree on the minimal interface of the ÀbstractGeomtry object before going forward to avoid wasting time and effort...
The interface means defining every attribute and method name with its signature.
Once this is defined, it can be implemented.

My proposal is

class AbstractGeometry(abc.ABC):
   
   @property
   def n_dimensions(self) -> int:
        # the number of physical dimensions

    def distance(self, x: Array, y: Array) -> Array:
         # computes the physical distance. not necessarily smooth or differentiable.

     def translate(self, x: Array, R : Array) -> Array:
         # translates configuration x by vector R

     def random_uniform(self, shape: Union[int, Tuple[int, ...]], dtype: DType) -> Array:
          # random configurations uniform distribution

We still need to:

  • decide how to compute other distances. Should it be a mode=... keyword argument or a different function? if it's a mode, how should it be handled by spaces that don't support it?
    • Does this mean that we must implement the same mode everywhere? What should it be called the mode that gives the L2 distance (not minimum image)?
  • If we decide not to use mode how do we write general nn models that work with arbitrary geometries? should we add a distance_differentiable(x,y) method to this abstract class?
  • Positioning of this stuff. I propose adding the namespace netket.experimental.geometry for the time being and placing this stuff there. Eventually we can move this to netket.geometry.

@gpescia should confirm if he believes this minimal interface is enough to do everything we need.

Then, Cell and FreeSpace can add new properties and eventually new methods that are useful.

What is the Latticeargument that Gabriel is using? what does it represent?

netket/graph/continuous_geometry.py Outdated Show resolved Hide resolved
netket/graph/continuous_geometry.py Outdated Show resolved Hide resolved
netket/graph/continuous_geometry.py Outdated Show resolved Hide resolved
netket/graph/continuous_geometry.py Outdated Show resolved Hide resolved
netket/graph/continuous_geometry.py Outdated Show resolved Hide resolved
r"""Spatial extension in each spatial dimension"""
raise NotImplementedError

def distance(self, x, y=None, norm=False, mode=None) -> list[list]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why y=None? what does it mean? is it needed? what is norm?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed what is norm? mode should be enough I guess (e.g. mode=L2 )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If y=None it means distance(x,x) with the caveat that the norm is computed differently in that case.

To not having to call the method twice in case you want the norm and the vectors, there is the extra flag norm. We could have a seperate mode keyword for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. The definition of a distance is such that distance(x,x) is zero.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not if x contains N different positions. x can have shape (nsamples, nparticles, sdim) then only the diagonal is 0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in the case of x being a batch of samples, you return a matrix nsamples x nsamples of distances?

Can't we do this by having the following signature for the distance function?

  • distance( x:Array[N], y: Array[N]) -> Array[] # (a scalar)
  • distance( x:Array[B1, N], y: Array[N]) -> Array[B1]
  • distance( x:Array[N], y: Array[B2,N]) -> Array[B2]
  • distance( x:Array[B1, N], y: Array[B2,N]) -> Array[B1, B2]
  • distance( x:Array[B1, B2, N], y: Array[B3, N]) -> error

Copy link
Collaborator Author

@gpescia gpescia Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nsamples, nparticles, nparticles, sdim) or the upper triangular in case specified (nsamples, nparticles*(nparticles-1)/2, sdim).

Now the function works for x=(…,N,sdim) and y=(…,M,sdim) but yes we could use dispatch.

netket/graph/continuous_geometry.py Outdated Show resolved Hide resolved
netket/graph/continuous_geometry.py Outdated Show resolved Hide resolved
put between 0 and 1."""
raise NotImplementedError

def from_lat_to_standard(self, x):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we agree on the name? I don't like the arbitrary abbreviation. I don't understand what standard means. Where is this used? Is it necessary to have it part of the base class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes this should not be in the base class it's a specific thing of Cell I guess

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a naming issue. What lattice is supposed to be is a basis of the physical space. Standard is the standard basis e.g. (1.0),(0,1) in 2D. So a better naming instead of lattice is probably basis. The two functions then allow to transform back and forth between the two bases.
I actually think it belongs to the base class. I‘m not sure of a use case right now but also in free space it might make sense to define another basis than the standard basis.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should try to use terminology that is also consistent with Graph.Lattice (actually that class contains most of the periodic cell functionalities already, so there might be overlapping code we can reuse?)

netket/graph/continuous_geometry.py Outdated Show resolved Hide resolved
@gcarleo
Copy link
Member

gcarleo commented Sep 21, 2023

I agree we should first settle on an Abstract Geometry class. I think what @PhilipVinc is ok, but I would make some changes (mostly, I think translate is too specific, the general thing is adding vectors or performing other operations)

class AbstractGeometry(abc.ABC):
   
   @property
   def n_dimensions(self) -> int:
        # the number of physical dimensions

    def distance(self, x: Array, y: Array, mode=None) -> Array:
         # computes the physical distance. not necessarily smooth or differentiable.
         # every geometry implements different modes that are specific of that geometry

    def  add(self, A: Array, B: Array): -> Array:
         # A+B with folding into pbc if prescribed by the geometry etc 

     def random_uniform(self, shape: Union[int, Tuple[int, ...]], dtype: DType) -> Array:
          # random configurations uniform distribution

I also endorse netket.experimental.geometry for the time being

@PhilipVinc
Copy link
Member

def __add__(self, A: Array, B: Array): -> Array: is to implement addition of the object itself, namely geometry, not two third party objects,

@gcarleo
Copy link
Member

gcarleo commented Sep 21, 2023

yes, ok... indeed I think the main issue is that a vector should not be a vector but a Geometry object actually

@gcarleo
Copy link
Member

gcarleo commented Sep 21, 2023

or, better, a Geomety has Elements that are what you are calling Array now

@gcarleo
Copy link
Member

gcarleo commented Sep 21, 2023

or, much easier:

def add(self,A,B) this seems legitimate to me. You just call geo.add(R1,R2)

@PhilipVinc
Copy link
Member

Yes, a PointNd, which is generated by the Geometry object.
But let's not get into such details because a custom object would break all of jax and let's implement Geometry.add(x, y), what do you think?

@PhilipVinc
Copy link
Member

Ok, what about the modeargument to distance? what should be the default? what are other possible values we reasonably can think about?

@gcarleo
Copy link
Member

gcarleo commented Sep 21, 2023

It might be good also to check what other establidhed libraries do, I found this for example https://wiki.fysik.dtu.dk/ase/ase/atoms.html#ase.Atoms.get_distance

@PhilipVinc
Copy link
Member

@gcarleo one more thing that comes to my mind: geometry could be useful (think necessary) to specify the partition one wants to compute entanglement entropy for in a continuous system.

It would be useful to think about how to handle/expose specifying a sub-part of the system there.

positions back to the geometry defined."""

@abc.abstractmethod
def random_init(self, shape):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't call this random_init, rather just random_uniform or similar? (the init part is not relevant from the Geometry point of view)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For free space I just return zeros, so it is also not really uniform (also not random actually).. So both names are a bit misleading maybe

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had proposed random_defaulta while back

For free space this could just return zeros. For periodic space one could initialize on a lattice.
N: int, number of particles."""

def from_basis_to_standard(self, x):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure these belong to the general interface, for example they seem to be used only by Cell?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now yes, but I think the defining property of a geometry is a basis, and there might be instances where you want to define a different basis than the standard basis also for free space. Then going back and forth between the custom basis and the standard basis should be part of geometry in my view.

pbc: Tuple or bool indicating whether to use periodic boundary conditions in a given physical dimension.
If tuple it must have the same length as domain. If bool the same value is used for all the dimensions
defined in domain.
pbc: Tuple of bools or bool indicating whether to use periodic boundary conditions in a given physical
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personally, I would be in favor of breaking the interface instead of adding deprecations and convoluted consutrctors etc... but OK, it's not so user friendly maybe :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants