Discussing the scope of the atomistic.software list #48

Open
ltalirz opened this issue Apr 16, 2021 · 3 comments
Labels
policy Scope, organisation, etc.

Comments

@ltalirz
Owner

ltalirz commented Apr 16, 2021

Please find below a conversation with @ceriottm, who kindly agreed (suggested, actually) to share it here as a record of the reasoning behind the current scope of the atomistic.software list and its evolution going forward.

Hello Leopold,
I hope this email finds you well. I stumbled upon this website
http://atomistic.software
that apparently you manage, and I was wondering if you could include also i-PI
http://ipi-code.org/
The main publications you can track for it are
https://www.sciencedirect.com/science/article/pii/S001046551300372X
and
https://www.sciencedirect.com/science/article/pii/S0010465518303436
Thanks a lot and all the best
Michele


Hi Michele,

thanks for reaching out!

I have already considered adding i-pi (see [1]) and my original impression - not having used the code myself - was that it seemed to be more of a wrapper around simulation engines than a simulation engine itself, going by the rough definition I'm currently using [2]:

> a piece of software that, given two sets of atomic elements and positions, can compute their (relative) internal energies. In most cases, engines will also be able to compute the derivative of the energy with respect to the positions, i.e. the forces on the atoms, and perform tasks like geometry optimizations or molecular dynamics.

The reason I'm making this distinction is that I currently don't have a category for wrapper/orchestration-type software and it is not clear to me how one could limit the scope of such a category in an elegant way. E.g. where should one draw the line in this list: i-pi => deepmd-kit => ASE => AiiDA => fireworks => [insert generic workflow manager here]?

That said,

  1. My understanding of how i-pi works might be wrong, and
  2. I'm very open to improving/modifying the definition to include codes like i-pi, if it is possible to do it in a way that does not lead to the list exploding.

Please let me know your thoughts!

Cheers,
Leo

[1] #21
[2] https://github.com/ltalirz/atomistic-software#scope
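
To make the quoted definition of an "engine" concrete, here is a minimal sketch of a toy engine: it takes atomic positions and returns the energy and the forces (minus the derivative of the energy with respect to the positions). The Lennard-Jones pair potential, the function name `lj_energy_and_forces` and its parameters are illustrative assumptions, not taken from any code on the list; the toy also ignores atomic elements, which a real engine would use.

```python
import math


def lj_energy_and_forces(positions, epsilon=1.0, sigma=1.0):
    """Toy 'engine': Lennard-Jones energy and forces for a list of 3D positions.

    Hypothetical helper for illustration only. All atoms are treated as the
    same species, and there is no cutoff or periodic boundary.
    """
    n = len(positions)
    energy = 0.0
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            sr6 = (sigma / r) ** 6
            # Pair energy: 4*eps*[(sigma/r)^12 - (sigma/r)^6]
            energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
            # Radial derivative dE/dr of the pair energy
            de_dr = 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r
            for k in range(3):
                f = -de_dr * d[k] / r  # force on atom i along axis k
                forces[i][k] += f
                forces[j][k] -= f      # equal and opposite force on atom j
    return energy, forces


# Two configurations of a dimer: at the potential minimum and stretched.
r_min = 2.0 ** (1.0 / 6.0)
e_min, f_min = lj_energy_and_forces([[0.0, 0.0, 0.0], [r_min, 0.0, 0.0]])
e_far, f_far = lj_energy_and_forces([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
relative_energy = e_far - e_min  # the "(relative) internal energy"
print(relative_energy)
```

In this picture, a wrapper such as the ones discussed in this thread would call a function like this one (or an external program providing the same quantities) rather than implement the potential itself.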


Hello. I asked myself that question, but then I saw you had ASE in it, which is as much of a wrapper as it gets. Sure, it has a module to compute some
simple potentials, but so has i-PI, and I would not argue about it being an "engine" based on that - it's just not how it is used by most people.
Personally I find the current definition of an engine arbitrary and unnecessarily narrow: you already make an arbitrary exception for "spectroscopy" codes
and there's more to life than energy and (perhaps) forces ^_^'

Based on what the domain says, the line seems to be naturally drawn by the focus on "atomistic" simulations: I would not be surprised to see phonopy or
AiiDA on that list, I'd be surprised to see, say, signac or abaqus. I agree there's a risk of the list exploding - from that point of view I think it would make sense
to apply a "relevance" threshold, and to apply it retroactively as otherwise you'll get endless complaints. From that PoV I think that i-PI does not (currently)
meet the 100 citations criterion, and that seems to me a perfectly good reason to "wait and see": as I mentioned, the only thing that weakens that argument
is that half of the codes on that list don't make it. That's also a criterion that is easy to automate BTW so a big plus!

All the best
Michele


Thanks for sharing your thoughts, Michele, they are very welcome!

Part of the inconsistencies you mention stem from the fact that the original version of the list by Luca Ghiringelli [1] didn't have a relevance threshold and included codes like yambo and BerkeleyGW but listed them under "WFM" (BerkeleyGW) and "DFT" (yambo) although you typically can't compute total energies with them. It also included ASE.

Your comments make me think that I will need to remove the historical codes that don't meet the relevance criterion I imposed for new additions. I had documented the inconsistency here [2] but I fear people won't see it and get confused. I guess even if I had documented it more clearly in the "about" http://atomistic.software/#/about that would not solve the problem...
As for the threshold itself, the number is up for debate. As one can see in [3] there aren't all that many codes on the 20-100 citations/year watchlist (for the <20 citations, the watchlist is of course very incomplete), so one could imagine lowering it to something like 50, but 100 seemed like a reasonable round number.

As for the scope, I agree with your point about the definition being narrow and I'll think about how best to extend it. I think it's very positive if developers want to see their code on the list, and in the end the purpose of this list is to be a useful resource for practitioners in the field, so in that context having codes like i-pi and ASE on the list certainly makes sense.
If you were to pick a name for the category of codes like ASE or i-pi, what would it be?

Cheers,
Leo

[1] https://www.nomad-coe.eu/old-pages/externals/codes
[2] https://github.com/ltalirz/atomistic-software#adding-a-simulation-engine
[3] #21


Hi Leo,

I understand the "historical" side and TBH I think your n.1 goal should be not to get too much harassment for getting involved in the maintenance of this list.
To me, it would make sense really to make the process as automated as possible, and to set up things so that developers share as much of the burden as
possible. I think 100 cites is indeed a nice round number, and Google Scholar as a source is rounding up so I do think it's fair, and it is a fairly high bar so you
can be sure you won't get thousands of entries to worry about.

As for "categories" I think it would make your life much easier (goal n. 2!) to think in terms of "tags" - there I could think of having total energy; functional properties;
md and sampling; structure optimization and search; machine learning models; workflows and automation; analysis and visualization; .... - once again, the onus of
choosing tags might be on the developers rather than on you.

All the best
Michele


Hi Michele,

> I understand the "historical" side and TBH I think your n.1 goal should be not to get too much harassment for getting involved in the maintenance of this list.
> To me, it would make sense really to make the process as automated as possible, and to set up things so that developers share as much of the burden as
> possible. I think 100 cites is indeed a nice round number, and Google Scholar as a source is rounding up so I do think it's fair, and it is a fairly high bar so you
> can be sure you won't get thousands of entries to worry about.

Ok!

> As for "categories" I think it would make your life much easier (goal n. 2!) to think in terms of "tags" - there I could think of having total energy; functional properties;
> md and sampling; structure optimization and search; machine learning models; workflows and automation; analysis and visualization; .... - once again, the onus of
> choosing tags might be on the developers rather than on you.

Thanks for the suggestions! The current "categorization" is already done in terms of tags - currently there is one set of tags for the method (dft/ff/tb/...) and one set of tags for more technical aspects.
Lumping all tags together would make life easy here... I wonder whether it still makes sense to let tags have a "type". I'll think about it over the weekend.

Cheers,
Leo
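
The question raised in the last reply, whether tags should keep a "type" or be lumped together, can be sketched as a small data structure. This is only an illustration under assumptions: the `Tag` class, the `group_by_type` helper and the type name "capability" are hypothetical and not the schema used by the list; only the method tags dft/ff/tb and the suggested tag names come from the thread.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """A tag with a type. Hypothetical structure, not the list's actual schema."""
    name: str
    type: str = "untyped"


def group_by_type(tags):
    """Group tag names by type, keeping input order within each group."""
    groups = defaultdict(list)
    for tag in tags:
        groups[tag.type].append(tag.name)
    return dict(groups)


tags = [
    Tag("dft", "method"),
    Tag("ff", "method"),
    Tag("tb", "method"),
    Tag("md and sampling", "capability"),           # type name is an assumption
    Tag("workflows and automation", "capability"),  # type name is an assumption
]

typed = group_by_type(tags)         # typed view: one group per tag type
flat = [tag.name for tag in tags]   # flat view: all tags lumped together
print(typed)
print(flat)
```

Keeping the type on each tag costs little, since the flat view can always be derived from the typed one, while the reverse is not possible.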

@ltalirz
Owner Author

ltalirz commented Apr 19, 2021

As a first step, the 100 citations/year cutoff has now also been enforced on historical entries (30a0b69).
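
Since the cutoff was described in the thread as easy to automate, here is a minimal sketch of what applying it uniformly could look like. The `CodeEntry` record, the `split_by_cutoff` helper, the entry names and the citation counts are all made up for illustration, and treating exactly 100 citations as passing is an assumption; this is not the code behind the commit above.

```python
from dataclasses import dataclass

CITATIONS_PER_YEAR_CUTOFF = 100  # threshold discussed in this thread


@dataclass
class CodeEntry:
    """Hypothetical record for one code (not the repository's actual schema)."""
    name: str
    citations_last_year: int


def split_by_cutoff(entries, cutoff=CITATIONS_PER_YEAR_CUTOFF):
    """Return (listed, watchlist): entries at or above the cutoff, and the rest."""
    listed = [e for e in entries if e.citations_last_year >= cutoff]
    watchlist = [e for e in entries if e.citations_last_year < cutoff]
    return listed, watchlist


# Made-up numbers, for illustration only.
entries = [
    CodeEntry("code-a", 250),
    CodeEntry("code-b", 100),
    CodeEntry("code-c", 42),
]
listed, watchlist = split_by_cutoff(entries)
print([e.name for e in listed])
print([e.name for e in watchlist])
```

Applying the same function to historical and new entries alike is what removes the inconsistency discussed above.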

@ceriottm

Just had a quick browse through the commit diff - seems strange that dftb+ doesn't make the cut - the main publication is from 2007 and is listed on GScholar at 1500 cites and counting

@ltalirz
Owner Author

ltalirz commented Apr 20, 2021

Thanks for checking! My gut feeling also was that DFTB+ was relatively widely used but then I didn't know for sure.
The citations of the paper in the year 2020 are shown as 187.

I went through a few of them and they do generally seem to contain the term "DFTB+", which was the only query string used. This almost looks like an indexing issue to me; I'll see whether I can report it to Google Scholar.

In the meantime, we can switch to citations of the paper as the source. Fixed in 39eacac

P.S. I also just checked that no other query string of codes in the list currently contains the "+" symbol.
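
A check like the one in the P.S. could be scripted along these lines. This is a sketch under assumptions: the `queries_containing` helper and the mapping from code names to query strings are hypothetical, and only the "DFTB+" query string is taken from the thread.

```python
def queries_containing(query_strings, symbol="+"):
    """Return the names of codes whose search query contains `symbol`, sorted."""
    return sorted(name for name, query in query_strings.items() if symbol in query)


# Hypothetical mapping from code name to search query string.
query_strings = {
    "DFTB+": "DFTB+",
    "code-a": "code-a",
    "code-b": "code-b 2.0",
}
flagged = queries_containing(query_strings)
print(flagged)
```

Any flagged entry would be a candidate for switching to paper citations as the source, as was done for DFTB+.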

@ltalirz ltalirz added the policy Scope, organisation, etc. label Jun 29, 2021