[Feature Request] Sliding Origin for MSD Computation #3796

Open
GenieTim opened this issue May 25, 2023 · 14 comments

@GenieTim
Contributor

Summary

The compute msd command currently takes only one origin into account.
One must issue this command multiple times to get multiple origins, while in theory it would be possible to compute the MSD using "all" origins (a sliding origin) to get very smooth results.

Detailed Description

I imagine one path to a sensible implementation to be this: using the following relation from Wikipedia,

[screenshot of the MSD/autocorrelation relation from the Wikipedia article]

an extension to fix ave/correlate/long seems a suitable way. Say, a parameter "msd" which, when used, additionally computes the two quantities <r^2(t)> and <r^2(0)> (it might be that I misunderstand the correlation computation a bit, but if I am not mistaken, <r^2(0)> is the first entry in the autocorrelation computation anyway) and automatically applies the formula above before outputting.
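For reference, the relation in question (as I read it on the Wikipedia page, written in the same notation as above) is:

MSD(t) = <|r(t) - r(0)|^2> = <r^2(t)> + <r^2(0)> - 2 <r(t) · r(0)>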

To me, this felt like it could be easy to implement; however, the code of fix ave/correlate/long is not particularly easy to understand. Additionally, I may be missing some understanding of the autocorrelation itself, which is why I would appreciate a hint in case there is already an obvious way to achieve this in LAMMPS.

Further Information, Files, and Links

The Wikipedia article I mentioned above: MSD on Wikipedia
Another MSD implementation using multiple-tau autocorrelation: jkriege2/StatisticsTools

@Bibobu
Collaborator

Bibobu commented May 25, 2023

Hi @GenieTim,
Seeing this FR made me think of a couple of comments I could make here:

  • From experience, on-the-fly MSD computation is nice but quickly gets computationally intensive with multiple-tau algorithms. You basically end up doing averages over short time intervals over and over along a given trajectory.
    An elegant alternative, both faster and needing less memory, is using the Fourier-transform product of the positions. See here for several algorithms in Python (a minimal sketch also follows this list).
    As far as I've seen, I have found no alternative more interesting than doing this as a post-processing operation. But I'll be interested in any suggestion for implementing it on the fly.

  • The fix ave/correlate/long has several differences from fix ave/correlate concerning access to the output, and I think the latter has seen several changes compared to the former. I also think that neither of them is designed to handle atom-style variables at the moment. This might need significant work as an additional feature. I would still personally start with fix ave/correlate, but for atom-style variables I think these computations can quickly need large amounts of memory, even with the correlate/long optimisations.
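To illustrate the first point, here is a minimal NumPy sketch of the FFT-based MSD (the function names are only illustrative; it assumes a single unwrapped trajectory r of shape (n_frames, 3)):

```python
import numpy as np

def autocorr_fft(x):
    """Linear autocorrelation sum_k x(k) x(k+m), normalised per lag, via FFT."""
    n = len(x)
    f = np.fft.fft(x, n=2 * n)                    # zero-pad to avoid circular wrap-around
    acf = np.fft.ifft(f * f.conjugate())[:n].real
    return acf / (n - np.arange(n))               # divide by the number of pairs at each lag

def msd_fft(r):
    """MSD over all time origins for one unwrapped trajectory r of shape (n_frames, 3)."""
    n = len(r)
    d = np.square(r).sum(axis=1)                  # |r(t)|^2
    d = np.append(d, 0.0)
    s2 = sum(autocorr_fft(r[:, i]) for i in range(r.shape[1]))
    q = 2.0 * d.sum()
    s1 = np.zeros(n)
    for m in range(n):                            # running sum for S1(m) = <r(t+m)^2 + r(t)^2>
        q -= d[m - 1] + d[n - m]
        s1[m] = q / (n - m)
    return s1 - 2.0 * s2                          # MSD(m) = S1(m) - 2 S2(m)
```

Averaging msd_fft over the atoms of interest then gives the sliding-origin MSD discussed above, as a post-processing step.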

@GenieTim
Contributor Author

Thanks for your insights, @Bibobu.

Two more references:

@GenieTim
Contributor Author

GenieTim commented Feb 13, 2024

After close to a year, I had to revisit this, and might have found a way to do it without any additional implementation required (i.e., using the existing fix ave/correlate/long):

# group of atoms we are interested in
group centralBeads type 5

# property we need
compute pos centralBeads property/atom xu yu zu
# assign one chunk per atom
compute allids centralBeads property/atom id
compute centralAtomChunks centralBeads chunk/atom c_allids ids once nchunk once discard yes compress yes

# turn the per/atom values into a global (chunk) vector
fix globalCentralPos centralBeads ave/chunk 1 1 1 centralAtomChunks c_pos[1] c_pos[2] c_pos[3]
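# (the first two columns of the ave/chunk output are bookkeeping; columns 3-5 hold the averaged c_pos[1], c_pos[2], c_pos[3])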
variable globalCentralX vector f_globalCentralPos[3]
variable globalCentralY vector f_globalCentralPos[4]
variable globalCentralZ vector f_globalCentralPos[5]

variable globalCentralX2 vector f_globalCentralPos[3]^2
variable globalCentralY2 vector f_globalCentralPos[4]^2
variable globalCentralZ2 vector f_globalCentralPos[5]^2

# finally, actually conduct the autocorrelation of the position for all the selected atoms
fix posCorr centralBeads ave/correlate/long 1 100000 v_globalCentralX[*10] v_globalCentralY[*10] v_globalCentralZ[*10] type auto nlen 16 ncount 2 ncorr 30 file output/autocorr-of-pos.all-atoms.out.correlate.txt

# for the formula, we also need the time average of the squared positions
fix outPos centralBeads ave/time 1 100000 100000 v_globalCentralX2 v_globalCentralY2 v_globalCentralZ2 file output/pos.all-atoms.out.vec-avg.txt mode vector

the rest (the formula shown above) can be done in post-processing.
This is how – as far as I can see – one can achieve multiple-tau MSD computation in-flight, with moderate memory requirements (since I am studying trajectories of 1e8 steps and up, outputting the coordinates at every step became infeasible, and outputting them irregularly just felt unscientific).

@S-Lykles
Contributor

@GenieTim your script seems to be calculating <r(t)>*<r(0)> and not <r(t)*r(0)>; are you sure they're equivalent?

@GenieTim
Contributor Author

Hm... @S-Lykles I would argue my script computes:

  • with fix ave/correlate/long: <x(t)x(0)>, <y(t)y(0)>, <z(t)z(0)>, where the brackets denote the time average (not the central-beads average). I get <r(t)r(0)> as <x(t)x(0)> + <y(t)y(0)> + <z(t)z(0)>
  • with fix ave/time: <x(t)^2>, <y(t)^2>, <z(t)^2>, which in equilibrium equal <x(0)^2>, <y(0)^2>, <z(0)^2>. Again, I can use <r(t)^2> = <x(t)^2> + <y(t)^2> + <z(t)^2> (YES, I just edited my script, the square was indeed missing)
  • finally, in post-processing, I can compute the MSD as MSD = 2*<r(t)^2> - 2*<r(t)r(0)> (assuming equilibrium); a small sketch of this step follows below
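A minimal NumPy sketch of that post-processing step (the names are only illustrative, and it assumes the three autocorrelation columns and the three time-averaged squares have already been loaded from the two output files):

```python
import numpy as np

def msd_from_correlations(cxx, cyy, czz, x2_mean, y2_mean, z2_mean):
    """Combine per-component autocorrelations and mean squares into the MSD.

    cxx, cyy, czz : arrays of <x(t)x(0)>, <y(t)y(0)>, <z(t)z(0)> per lag time
    x2_mean, y2_mean, z2_mean : scalars <x^2>, <y^2>, <z^2> from fix ave/time
    """
    r2_mean = x2_mean + y2_mean + z2_mean   # <r^2>, assumed stationary
    rt_r0 = cxx + cyy + czz                 # <r(t)r(0)>
    return 2.0 * r2_mean - 2.0 * rt_r0      # MSD(t) = 2<r^2> - 2<r(t)r(0)>
```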

But I am thankful for any issues you see.

@S-Lykles
Contributor

Ah, my bad, I thought that variable globalCentralX vector f_globalCentralPos[3] was calculating <x(t)> because of the ave/chunk, but I see what it is doing now. Am I correct, though, that v_globalCentralX[*10] takes only the first 10 entries, so the autocorrelation is performed for only 10 atoms?
I suppose your solution still highlights the desire to have a fix ave/correlate that could correlate per-atom variables, as your method serialises the correlation, which would be very slow if you wanted to sample all atoms in a big system.

@GenieTim
Contributor Author

GenieTim commented Feb 20, 2024

Yes, you are right, the "*10" limits it to 10, as this is the number of atoms in the centralBeads group in my case. For some reason, fix ave/correlate/long does not work with "*" alone, so this has to be spelled out manually for now.

And yes, absolutely, the conversion between per-atom and global vectors like this feels very smelly and is a waste of resources, but as far as I know there does not seem to be another way yet.

@S-Lykles
Contributor

I was working on a custom fix to calculate MSD and some other properties, but this post made me realize that writing a fix ave/atom/correlate would be sufficient and might have other use cases as well. I'll give it a go, thanks.

@GenieTim
Contributor Author

Could you kindly reference me or this issue, @S-Lykles, when you have a working implementation? Thanks.

@GenieTim
Contributor Author

GenieTim commented Mar 8, 2024

While my code works, comparing it to results from outputting the coordinates and computing the MSD externally indicates some considerable issues with the multiple-tau method. I am not sure whether it is actually an issue of multiple-tau or of my post-processing, but so far I have not found a problem in the relatively simple script, yet there is some considerable strangeness:

[plot: comparison of the multiple-tau MSD with the externally computed MSD]

@S-Lykles
Contributor

S-Lykles commented Mar 9, 2024

@GenieTim In the paper about the multiple tau correlator they mention:

"F. Nonaveraged algorithm
For some observables, such as the mean-square displace-ment, the systematic error due to averaging can become verylarge, which can be resolved using corrections. However, ingeneral, the derivation of these corrections can become quitetedious (e.g., for higher moments of the displacement). Moreover, they also do not correct the smoothing effect (i.e.,the first sum of Eq. (11)). Another way of resolving this issueis not to submit average values to the correlator"

I have a fork where I added an option to ave/correlate/long to use the nonaveraged algorithm. I haven't gotten around to testing it yet, but I plan to make a pull request once I have.

@S-Lykles
Contributor

S-Lykles commented Mar 9, 2024

You can find it here; if you like, you can help test it for me :)

https://github.com/S-Lykles/lammps/tree/nonaveraged_correlate_long

@GenieTim
Contributor Author

GenieTim commented Mar 9, 2024

Ah, you are right, that slipped my mind, thanks for pointing it out. I will gladly give your implementation a try early next week.

@GenieTim
Contributor Author

Ok, so, I have given your implementation a try, and it seems to work just fine. However, I have realised that there is another caveat and a corresponding trick to improve the accuracy of the multiple-tau method: the time average from fix outPos centralBeads ave/time 1 100000 100000 v_globalCentralX2 v_globalCentralY2 v_globalCentralZ2 file output/pos.all-atoms.out.vec-avg.txt mode vector is not consistent enough with the multiple-tau autocorrelation at t = 0. By instead using the very first entry of the multiple-tau autocorrelation (<r(0)r(0)>) for the subtraction, we get a more reasonable correspondence:

[plot: MSD comparison using the correlator's first entry as the baseline]

(in this figure, only N = 80 got the non-averaged treatment; all other systems are the same as in the plot above, except for what I used as the baseline)
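In code, the change amounts to the following (illustrative NumPy, reusing the names from the sketch I posted above):

```python
rt_r0 = cxx + cyy + czz               # <r(t)r(0)> from fix ave/correlate/long
msd = 2.0 * rt_r0[0] - 2.0 * rt_r0    # use the correlator's own first entry <r(0)r(0)> as the baseline
```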

Status: Looking for a volunteer