Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Config Design #292

Closed
FabianHofmann opened this issue Nov 17, 2021 · 10 comments
Closed

Config Design #292

FabianHofmann opened this issue Nov 17, 2021 · 10 comments

Comments

@FabianHofmann
Copy link
Contributor

I thought a bit about how to deal with the mixture of config.default.yaml, config.yaml and the code-based parameters which are either keyword arguments in the functions or config.get()-defaults. I think this make the whole workflow hard to track and nasty to update. The best solution I can think of is the following:

There are no parameter settings in the code, means no keyword arguments and dict.get()-defaults. All the parameter come from the config files. The basis config file is the config.default.yaml, the config.yaml is (by default) an empty file which is only used to update (!) the config.default.yaml. The start of the Snakefile would include this

from pathlib import Path
from snakemake.utils import update_config
from snakemake.io import load_configfile

configfile: "config.default.yaml" # only for setting defaults, do not change

customconfig = Path("config.yaml") # custom configuration  
customconfig.touch(exist_ok=True)
update_config(config, load_configfile(customconfig))

Advantages:

  • Unique set of parameters.
  • When updating pypsa-eur, one does not have to merge old and new config files, i.e. look for new default values and insert them into the existent config.yaml.
  • Config.yaml files are getting much smaller and only contain effective changes against the default. Also accounts for the test configs. We don't have to align them every time the config changes.

Disadvantages:

  • Users have to be aware of the fact that the configfile argument is only for defaults. Perhaps it becomes clearer when setting a CUSTOMCONFIG argument?

Any thought @fneum @euronion @martacki ?

@FabianHofmann
Copy link
Contributor Author

#275

@koen-vg
Copy link
Contributor

koen-vg commented Nov 25, 2021

I know I wasn't asked but I do have some thoughts on this :)

  • I like the idea of separating "default" and "custom" config, which would just make it a lot easier to see at a glance what is different about a particular config.

  • Consider the use-case of including pypsa-eur as a snakemake subworkflow (in pypsa-eur-sec or some other project). Ideally you want to do something like

    subworkflow pypsaeur:
        <...>
        configfile: "config/pypsa-eur-config.yaml"
    

    But with your current suggestion, that would override the "default" config?

More generally speaking (maybe a bit beyond what this issue was meant for), I think a pain point with using pypsa-eur right now is that, loosely speaking, snakemake isn't "aware" that most network files depend on the config. This leads to situations like when you change the countries section of the config, you have to delete all network files and start over. It's also cumbersome to compare models generated with different configurations: the easiest thing is just to have completely separate pypsa-eur projects.

One rough idea I came up with is that snakemake could compute a short hash of the config.yaml file1 and use that hash as a wildcard. You would have network files like {hash}_base.nc, {hash}_elec.nc, {hash}_elec_s{simpl}_.... Then, when you change the config, you'll get a new hash and a new set of network files.

This also brings some problems: maybe not all the files had to be regenerated, and keeping track of configs and their hashes might be a bit of work. There is also the pypsa-eur-sec solution: create different named "run" directories which each can have their own config. I would also appreciate some kind of solution like this for pypsa-eur, in order to be able to work with multiple configs conveniently.

The "holy grail" would be for the whole workflow to "be aware" of which config changes affect which files, so that I could for example change the solving config, run snakemake again and have it automatically re-run just the solve_network rule because I changed config related to that rule (but only that rule). This seems like a distant goal though, and maybe not technically feasible.

Footnotes

  1. Or a relevant part of the file, or the actual nested dictionary structure.

@euronion
Copy link
Contributor

euronion commented Dec 1, 2021

I like were you're idea is going @FabianHofmann . I'd suggest a small change which would address the first issue @koen-vg is pointing out:

from pathlib import Path
from snakemake.utils import update_config
from snakemake.io import load_configfile

configfile: "my-great-custom-config.yaml"

# only for setting defaults, do not change
default_configfile: "config.default.yaml"

config = load_configfile(default_configfile)
update_config(config, load_configfile(configfile))

Changes:

  • No .touch(exists_ok=True): Force users to create a config file, even if it is empty (and thus force the user to think about it for 30 seconds at least)
  • The overwrite config is called configfile, thus should (hopefully) be correctly overwritten by specifying a configfile in a subworkflow or by using the --configfile=<...> CLI flag to overwrite configfile.

I'll try it in my trace model and see how it works.

Regarding @koen-vg second point: I think the most pragmatic solution would indeed be promoting all relevant config entries to param entries for each rule, thus making the dependency explicit. Like Calliope. And then use Tim Troendle's PR here to re-run rules with changed params.

euronion added a commit to euronion/trace that referenced this issue Dec 1, 2021
Enable a default config and specific configs overwriting the defaults. See idea from PyPSA/pypsa-eur#292 .
@euronion
Copy link
Contributor

euronion commented Dec 1, 2021

The code above does not work. This one does:

# Specify config file
configfile: "config/<custom_config>.yaml"

# Default configs - do not change
default_configfile="config/config.default.yaml"

# Load default config and overwrite with specific config
specific_config = config.copy()
config = load_configfile(Path(default_configfile))
update_config(config, specific_config)

@FabianHofmann
Copy link
Contributor Author

FabianHofmann commented Dec 1, 2021

Thanks for your thought @euronion, @koen-vg .

The code above does not work. This one does:

# Specify config file
configfile: "config/<custom_config>.yaml"

# Default configs - do not change
default_configfile="config/config.default.yaml"

# Load default config and overwrite with specific config
specific_config = config.copy()
config = load_configfile(Path(default_configfile))
update_config(config, specific_config)

Good point! In this case the user has to explicitly create and set the <config_config.yaml>? I like the idea of forcing the user to think about it, on the other hand the default setting should work out of the box. Is it too much if we set configfile per default to config/config.default.yaml?

# Specify config file
configfile: "config/config.default.yaml"

# Default configs - do not change
default_configfile="config/config.default.yaml"

# Load default config and overwrite with specific config
specific_config = config.copy()
config = load_configfile(Path(default_configfile))
update_config(config, specific_config)

EDIT:
ah no... actually that does not look nice...

@euronion
Copy link
Contributor

euronion commented Dec 1, 2021

We could set the default to the tutorial config, that way the mechanism is already shown and it works out of the box.

configfile: "config/config.tutorial.yaml"

@FabianHofmann
Copy link
Contributor Author

We could set the default to the tutorial config, that way the mechanism is already shown and it works out of the box.

configfile: "config/config.tutorial.yaml"

Great idea!

@euronion
Copy link
Contributor

euronion commented Dec 1, 2021

Should we prep it already for the next version? I can prepare the PR tomorrow.

The other issue also mentioned by @koen-vg and discussed by us on Monday off-list should be a separate PR. I believe the param solution is probably the best one. Could be a nice "help wanted" "enhancement" ;)

@FabianHofmann
Copy link
Contributor Author

That would be great! Let's prioritize this issue over #275 since the latter will become easier afterwards (both should be in the next release)

euronion added a commit that referenced this issue Dec 2, 2021
@euronion euronion added this to the Release 0.5 milestone Jan 10, 2022
@fneum fneum removed this from the Release 0.5 milestone Jan 13, 2022
fneum added a commit that referenced this issue Mar 6, 2023
Add small spelling/maths correction to documentation
@FabianHofmann
Copy link
Contributor Author

close due to other, more promising approaches

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants