Change the ordering of name and relative_path #109

Open
stuartmcalpine opened this issue Apr 2, 2024 · 2 comments
@stuartmcalpine
Collaborator

Currently relative_path is mandatory and name is not (it is automatically generated from the relative_path if not provided).

It is potentially more user-friendly the other way round: many users may not care about the data structure within the registry, i.e., they don't care about the relative_path. I think people will often just choose an arbitrary relative path because they have to enter something.

I propose making name mandatory and generating relative_path automatically if it is not provided. The automated relative_path may then be more meaningful in the long run for finding data (it could be something along the lines of <name>_<version>_<date>).
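
As a rough illustration of the proposal, here is a minimal sketch of how a default relative_path could be built. The function names (`default_relative_path`, `resolve_relative_path`) and the ISO date format are assumptions for illustration only, not the registry's actual API.

```python
from datetime import date
from typing import Optional


def default_relative_path(name: str, version: str, when: Optional[date] = None) -> str:
    """Build a path of the form <name>_<version>_<date> (hypothetical format)."""
    when = when or date.today()
    return f"{name}_{version}_{when.isoformat()}"


def resolve_relative_path(
    name: str,
    version: str,
    relative_path: Optional[str] = None,
    when: Optional[date] = None,
) -> str:
    """Use the caller's relative_path if given, otherwise generate one from name."""
    if not name:
        # Under the proposal, name becomes the mandatory argument.
        raise ValueError("name is mandatory")
    if relative_path is not None:
        return relative_path
    return default_relative_path(name, version, when)
```

For example, `resolve_relative_path("my_catalog", "1.0.0", when=date(2024, 4, 2))` would give `my_catalog_1.0.0_2024-04-02`, while an explicit relative_path is passed through unchanged.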

@stuartmcalpine stuartmcalpine added the enhancement New feature or request label Apr 2, 2024
@stuartmcalpine stuartmcalpine self-assigned this Apr 2, 2024
@JoanneBogart
Collaborator

JoanneBogart commented Apr 12, 2024

I'm more or less convinced, except that when the data are not to be copied (old_location not used), relative_path must still be mandatory. (In that case name could still be derived as it currently is, but maybe it's better to require name in any case to keep the interface from being too confusing.)

No, never mind that last part: there doesn't need to be any special treatment for name when old_location is None. But relative_path should be required.
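
The resulting rule could be checked up front, along these lines. This is only a sketch: `check_register_args` is a hypothetical helper, and the real registration call may validate its arguments differently.

```python
from typing import Optional


def check_register_args(
    name: Optional[str],
    relative_path: Optional[str],
    old_location: Optional[str],
) -> None:
    """Raise ValueError if the argument combination is not allowed."""
    if not name:
        # name is always required, with or without old_location.
        raise ValueError("name is required")
    if old_location is None and relative_path is None:
        # Nothing is copied, so the registry cannot choose where the data live.
        raise ValueError("relative_path is required when old_location is None")
```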

@JoanneBogart
Collaborator

JoanneBogart commented May 17, 2024

Concerning overwritable: I can imagine two ways someone might want to use it:

  1. While debugging, keep creating a dataset with identical name, version, version suffix (if any) and relative path until you're satisfied with it. But we've made this impossible: it conflicts with the uniqueness of (owner_type, owner, name, version, version_suffix), combined with our practice of always making a new db entry even when the old dataset (file(s)) is overwritten.
  2. While debugging, keep creating a dataset with identical name and relative path, but incrementing version.
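
To illustrate why case 1 cannot work, here is a small self-contained sketch using an in-memory SQLite table. The table layout is an assumption made up for the example (only the uniqueness tuple comes from the discussion above); the real schema and database backend may differ.

```python
import sqlite3


def try_register_twice() -> bool:
    """Return True if the same dataset could be registered twice, else False."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE dataset (
            dataset_id INTEGER PRIMARY KEY,
            owner_type TEXT NOT NULL,
            owner TEXT NOT NULL,
            name TEXT NOT NULL,
            version TEXT NOT NULL,
            -- empty string rather than NULL: SQLite treats NULLs as distinct
            -- in UNIQUE constraints, which would hide the conflict
            version_suffix TEXT NOT NULL DEFAULT '',
            relative_path TEXT NOT NULL,
            UNIQUE (owner_type, owner, name, version, version_suffix)
        )
        """
    )
    insert = (
        "INSERT INTO dataset "
        "(owner_type, owner, name, version, version_suffix, relative_path) "
        "VALUES (?, ?, ?, ?, ?, ?)"
    )
    row = ("user", "alice", "my_catalog", "0.0.1", "", "my_catalog/test")
    try:
        conn.execute(insert, row)
        # Overwriting the files still creates a new entry, so this second
        # insert with identical identifying fields violates the constraint.
        conn.execute(insert, row)
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()
    return True
```

Because every registration adds a row, the second insert fails regardless of whether the dataset is marked overwritable.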

If we recommend specifying name, and not necessarily relative_path, that's what people will be inclined to do, especially since it's usually less effort to think up just a name. This is even more true for people doing development, who are the ones most likely to want to overwrite and also the most likely to want to specify a minimum of parameters.

Regardless of what other changes we make or don't make, I think we should not allow overwriting a dataset with a different name.

We could handle 2. as follows: If relative_path is not specified, we could look for registered datasets whose version string matches except for patch number, or maybe patch number and version suffix. Find the one with greatest patch number which is overwritable and use its relative_path for the new dataset. (Or maybe only use it if the dataset with the largest patch number is overwritable.) Otherwise generate relative_path as usual from name, version and version suffix.
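
A sketch of that lookup, under several assumptions: versions are "major.minor.patch" strings, the `Dataset` record and `pick_relative_path` are hypothetical names, the match ignores both patch number and version suffix, only datasets with the same name are considered, and the fallback path format is a placeholder. It implements the first variant (take the overwritable dataset with the greatest patch number).

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Dataset:
    # Hypothetical record; field names are illustrative only.
    name: str
    version: str  # assumed "major.minor.patch"
    version_suffix: Optional[str]
    relative_path: str
    is_overwritable: bool


def parse_version(version: str) -> Tuple[int, int, int]:
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def pick_relative_path(
    name: str,
    version: str,
    version_suffix: Optional[str],
    registered: Iterable[Dataset],
) -> str:
    """Reuse the path of the latest overwritable match, else generate one."""
    major, minor, _ = parse_version(version)
    candidates = [
        d
        for d in registered
        if d.name == name  # never overwrite a dataset with a different name
        and d.is_overwritable
        and parse_version(d.version)[:2] == (major, minor)
    ]
    if candidates:
        latest = max(candidates, key=lambda d: parse_version(d.version)[2])
        return latest.relative_path
    # Fallback: generate from name, version and version suffix (format assumed).
    suffix = f"_{version_suffix}" if version_suffix else ""
    return f"{name}_{version}{suffix}"
```

The stricter variant in parentheses would instead look at the dataset with the largest patch number overall and reuse its path only if that one is overwritable.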
