Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Make lammps_extract_fix consistent with lammps_extract_compute for global arrays #4118

Open
eltonfc opened this issue Apr 4, 2024 · 4 comments
Assignees

Comments

@eltonfc
Copy link

eltonfc commented Apr 4, 2024

Summary

While lammps_extract_compute returns a pointer to a scalar, vector or array, consistent with the LMP_TYPE_* argument given to it, lammps_extract_fix always returns a pointer to a single value, which is inconsistent with the referenced data structure and withnthe behavior of lammps_extract_compute.

I propose making the behavior consistent by lammps_extract_compute returning a pointer to a [copy of a] data structure described by the Type constant.

Detailed Description

The C API lammps_extract_compute function, when called with style constant LMP_STYLE_GLOBAL returns a pointer to a scalar, a vector or an array, depending on the Type constant.

On the other hand, lammps_extract_fix, when called with the same constants, returns a pointer to a single value, regardless whether it was called with LMP_TYPE_SCALAR, LMP_TYPE_VECTOR or LMP_TYPE_ARRAY. This is done by copying the value and returning a pointer to it.

Even though it is explictly warned in the documentation for the C interface and @akohlmey added that info to the Python API documentation, the behavior is inconsistent with lammps_extract_compute and with the underlying referenced data structure.

Since lammps_extract_fix already allocates memory and returns the pointer to it, it may be possible to allocate memory for the whole fix data structure (vector, array) and return a pointer to it.

Further Information, Files, and Links

This issue was discussed in the forums and this feature request suggested there.

Thank you!

@eltonfc
Copy link
Author

eltonfc commented Apr 4, 2024

I understand that there are historic and performance reasons to return a copy of the data and there may be significant impact in performance if the function were to copy the whole vector/array from all CPUs in a parallel run, yet in the case for many fixes, such as histograms, the API user would implement the looping through the vector/array and call the API to gather all values anyway, adding overhead to allocate/free all values.

@sjplimp
Copy link
Contributor

sjplimp commented Apr 9, 2024

@eltonfc The basic reason that extract_fix() returns global data (scalar, vector, array) differently than extract_compute() is
that fixes don't store global data directly like computes do. Instead they provide functions which allow a caller to access the global data one element at a time. For some fixes, those individual values are not stored, they are calculated when requested. Even if the values are stored by the fix, they are usually not stored contiguously for vectors or arrays.

However, I can see the utility for users to just make a single call to extract an entire global vector or array, similar to computes.

How about if we enhanced the syntax for the function call, using the nrow and ncol args. If nrow = -1 for a global vector, then
return the entire vector. In nrow = -1 and ncol = -1 for a global array, return the entire array. Otherwise if nrow >= 0, ncol >= 0,
just return a single value, as now. The function would allocate the vector or array, return a pointer, and it would still be the user's
responsibility to free it. The lammps_free() method would also need to be enhanced to free an array, since that is a more complex data struct.

@eltonfc and @akohlmey Tell me what you think of this idea?

@akohlmey
Copy link
Member

akohlmey commented Apr 9, 2024

@sjplimp I have a more radical approach in mind. But that is difficult to describe here. We need to talk about this offline.

@eltonfc
Copy link
Author

eltonfc commented Apr 9, 2024

If nrow = -1 for a global vector, then return the entire vector. In nrow = -1 and ncol = -1 for a global array, return the entire array

That's a good solution, which doesn't break the API, which was something I was worried about.

The function would allocate the vector or array, return a pointer, and it would still be the user's responsibility to free it. The lammps_free() method would also need to be enhanced to free an array, since that is a more complex data struct.

That sounds, good. lammps_free() would then need a flag to know whether to free a single float or a vector array, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: High-Priority Features
Development

No branches or pull requests

3 participants