
Doc improvement suggestion #195

Open
OverLordGoldDragon opened this issue Sep 17, 2019 · 4 comments

Comments

@OverLordGoldDragon

Reading through the API essentials, the functionality is fairly well documented - except for how HyperparameterHunter (HH) optimizes hyperparameters. After enough reading, I figured out the basics, and I'm sure the entire API can be figured out - but the point is, the "hunting" aspect of HH isn't emphasized or explained up front. The two questions I found answers to last were the ones I had from the beginning:

  • How to specify which hyperparameters to optimize?
  • How to specify the search range?

E.g., the Keras example imports Real - but there's no way to tell what "Real" does without reading the docs; a more intuitive name would be RealSearchRange, or from search_range import Real - otherwise I'd assume it's a form of type casting. -- I intend to learn the API further, but my current question is this: I use my own training-loop class, which takes care of the following:

  • Training, via train_on_batch
  • Validation, via predict (using outputs to programmatically compute F1-score, loss, etc)
  • Data pipeline - all data preprocessed, and shuffled at each epoch
  • Checkpointing/logging - best model per F1-score, logging history, etc

Is it possible to set up HH to do only the hyperparameter search? I don't mind its other functionalities, so long as they don't conflict with my own.

@OverLordGoldDragon (Author) commented Sep 17, 2019

P.S., as an example, I created a PR with the kind of comment edits I'd have found helpful when starting out with HH.

Also, the Feature Engineering section contains a verbatim duplicate of a large portion of text, almost back-to-back - unsure if that's intended (image excerpt attached).

@HunterMcGushion (Owner)

Thanks for opening this issue! You make an excellent point, and I think it'd be a good idea to add a section in the README to clarify that search ranges are specified with Real, Integer and Categorical. For now, I'd like to keep the names as-is, unless others also feel that the naming is confusing. I think that in the space of hyperparameter optimization, the names are fairly appropriate, especially since they were taken verbatim from the Scikit-Optimize project. Also, I'd prefer to keep the names shorter. However, I really appreciate your feedback on the names, and I'm certainly open to further discussion if you still disagree. I think additional documentation (like you added in PR #196) would go a long way--especially if it's in a prominent place like the README.
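
To give a quick picture here as well: anywhere a concrete value would normally go, a search dimension can go instead, and an OptPro searches over the dimensions while treating everything else as fixed. A rough sketch (the parameter names and ranges below are just placeholders, not from the docs):

    from hyperparameter_hunter import Categorical, Integer, Real

    # Anywhere a concrete hyperparameter value would go, a dimension can go instead
    model_init_params = dict(
        max_depth=Integer(2, 20),                 # search integers in [2, 20]
        learning_rate=Real(0.0001, 0.5),          # search floats in [0.0001, 0.5]
        booster=Categorical(["gbtree", "dart"]),  # search over discrete choices
        subsample=0.5,                            # plain values are left untouched
    )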

I'll keep discussion on PR #196 in its comment section.

Turning to your question on using HH to only do hyperparameter search, could you elaborate a bit? Are you trying to use the OptPros without relying on Environment and CVExperiment, and bypassing the automatic Experiment result matching that goes along with them? Or do you simply want to add some custom functionality as with lambda_callback? I apologize if I'm missing your point.

@HunterMcGushion (Owner)

Ah thank you for catching the duplicated text in the "Feature Engineering" docs. It looks like Sphinx is copying the FeatureEngineer.__init__ docstring and displaying it for both the FeatureEngineer class and for the FeatureEngineer.__init__ method. Definitely not intentional, and I'll look into that!
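
If it's what I suspect, the relevant knob is autodoc's autoclass_content setting in conf.py - a sketch, assuming sphinx.ext.autodoc is what renders these pages:

    # conf.py
    # "class" -> render only the class docstring; "init" -> only __init__'s;
    # "both" -> concatenate the two, which can read as duplication when the
    # __init__ docstring is also shown on the method itself
    autoclass_content = "class"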

@OverLordGoldDragon (Author) commented Sep 17, 2019

Regarding the naming, I'd say at least one of the two would prove quite helpful: (1) comments on the examples (as in the PR); (2) importing from a submodule - e.g., from search_space import Real - to indicate that 'Real' relates to hyperparameter search rather than type casting.

To clarify my question - in essence, I've already taken care of every aspect of training, and wish to use HH only for hyperparameter search - as a sort of 'drop-in' add-on. In pseudo-code:

def do_validation():
    x, y = get_val_data()
    preds = model.predict(x)
    val_loss = loss_fn(y, preds)

epochs = 0
while epochs < 5:
    times_trained = 0
    while times_trained < trains_before_val:
        x, y = get_train_data()  # gets 'next' data, like a generator - but isn't a generator
        train_loss = model.train_on_batch(x, y)
        times_trained += 1
    do_validation()
    epochs += 1

Is there anywhere I can 'insert' HH above to do hyperparameter search?
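
For illustration only, here's the shape I'm imagining, sketched with Scikit-Optimize's ask/tell interface (where HH's Real/Integer/Categorical come from) - run_training_loop is a stand-in for my loop above, not an HH API:

    from skopt import Optimizer
    from skopt.space import Integer, Real

    opt = Optimizer(dimensions=[
        Real(1e-4, 1e-1, prior="log-uniform"),  # learning rate
        Integer(8, 128),                        # batch size
    ])

    for _ in range(20):
        params = opt.ask()                     # next suggested [lr, batch_size]
        val_loss = run_training_loop(*params)  # my loop above, returning val_loss
        opt.tell(params, val_loss)             # feed the result back to the optimizer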
