Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

further changes to the extraction pipeline #113

Open
LukasMut opened this issue Dec 6, 2022 · 0 comments
Open

further changes to the extraction pipeline #113

LukasMut opened this issue Dec 6, 2022 · 0 comments
Assignees
Labels
cleanup Deprecate or refactor code enhancement New feature or request

Comments

@LukasMut
Copy link
Collaborator

LukasMut commented Dec 6, 2022

There are a few changes I would like to see, myself, but this should ultimately be Lukas's decision.

I think we should ask for an out-folder and automatically name it with the layer the user is asking for. I find even myself, I've run code over multiple iterations and left the same out-path and it's overwritten features.npy and file_names.txt and I've needed to go back again. I think if we ask for a folder, then save in .../out_folder/module_name/features.npy it will be more useful, adding in a check to make the folder, if necessary, beforehand (Path(..).mkdir(parents=True, exist_ok=True) does this check easily enough). That way you could run the same command line code but only change module-name and it would instantly just put everything in the right place. That would be how I sort of expect this and would be easier for me, as I think a lot of people want features from multiple layers, and it's always easier to forget to make two changes to your calls, when only one is really necessary. When that happens, features.npy just gets overwritten and you have to run the whole thing again.

I also think out-path (or out-folder as I would want to call it) should be echoed back to the user after the extract-features call. Right now it just gives you the shape and a confirmation that the features were saved to disk.

So, only relatively small changes that are desirable but these could wait until the next release. I also wanted to re-echo the idea for an automatic lookup so that a relevant source can be selected based on the model name, so the user doesn't need to specify it. It's on us as people developing the library to work this out for the user (I think). If a researcher wants to use a specific model they've heard about, I think it should be sufficient to just input the model name (but advanced users can always select a specific source they want). Again, this isn't directly related to this CLI PR, but solving that would make the CLI experience a little easier.

All in all, at some point I would like to see a change in out-path if others agree (perhaps a good argument can be made for keeping it as is - that I haven't considered).

Originally posted by @Alxmrphi in #104 (review)

@LukasMut LukasMut changed the title There are a few changes I would like to see, myself, but this should ultimately be Lukas's decision. further changes to the extraction pipeline Dec 6, 2022
@LukasMut LukasMut added enhancement New feature or request cleanup Deprecate or refactor code labels Dec 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup Deprecate or refactor code enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants