Optuna Hyperband Algorithm Not Following Expected Model Training Scheme. #5380

d-sutariya · 2024-03-31T12:42:32Z

Don't use GitHub Issues to ask support questions.

No response

nzw0301 · 2024-03-31T13:00:44Z

Maintainer

Could you elaborate on your question?

d-sutariya · 2024-04-01T04:04:57Z

Author

Thank you for your quick reply.
I am new to GitHub, so I didn't understand where to put my question,
and I posted it as an Issue.
Sorry for that mistake on my side.

Expected behavior

I have observed an issue while using the Hyperband algorithm in Optuna. According to the Hyperband algorithm, with min_resource = 5, max_resource = 20, and reduction_factor = 2, the search should start with 4 models in bracket 1, each trained for 5 epochs in the first round. In each subsequent round, the number of models is halved and the number of epochs for the remaining models is doubled. The initial number of models should also be halved for each subsequent bracket, i.e. bracket 2 should start with 2 models. In total, 11 models are expected.

Link to the paper: https://arxiv.org/pdf/1603.06560.pdf
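
The round-by-round expectation for bracket 1 described above can be written out as a short calculation. This is only a sketch of that description, not of Optuna's implementation; the function name `halving_rounds` and its arguments are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def halving_rounds(
    n_models: int, min_resource: int, max_resource: int, reduction_factor: int
) -> List[Tuple[int, int]]:
    """Return (models, epochs per model) for each round of one bracket."""
    rounds = []
    epochs = min_resource
    while epochs <= max_resource and n_models >= 1:
        rounds.append((n_models, epochs))
        n_models //= reduction_factor  # keep only the best fraction of models
        epochs *= reduction_factor     # survivors get a larger epoch budget
    return rounds


# Bracket 1 as described: 4 models at 5 epochs, then halve and double.
print(halving_rounds(4, 5, 20, 2))  # [(4, 5), (2, 10), (1, 20)]
```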

Environment

Optuna version:
Python version:
OS:
(Optional) Other libraries and their versions:

Error messages, stack traces, or logs

1/4 ━━━━━━━━━━━━━━━━━━━━ 16s 6s/step - accuracy: 0.4062 - loss: 0.7439 - val_auc: 0.5824

WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1711943027.637560      85 device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
W0000 00:00:1711943027.654974      85 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update

4/4 ━━━━━━━━━━━━━━━━━━━━ 9s 1s/step - accuracy: 0.4373 - loss: 0.7521 - val_auc: 0.5028 - val_accuracy: 0.6000 - val_loss: 0.6930 - val_val_auc: 0.4267

[I 2024-04-01 03:43:50,809] Trial 0 finished with value: 0.48208534717559814 and parameters: {'unit_input': 25, 'num_layers': 3, 'num_layer_0': 22, 'activation_layer_0': 'selu', 'dropout_layer_0': False, 'num_layer_1': 29, 'activation_layer_1': 'tanh', 'dropout_layer_1': True, 'num_layer_2': 29, 'activation_layer_2': 'relu', 'dropout_layer_2': False, 'optimizer': 'adam'}. Best is trial 0 with value: 0.48208534717559814.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 864ms/step - accuracy: 0.5300 - loss: 0.7351 - val_auc: 0.5486 - val_accuracy: 0.4500 - val_loss: 0.7026 - val_val_auc: 0.3067

[I 2024-04-01 03:43:56,640] Trial 1 finished with value: 0.5201288461685181 and parameters: {'unit_input': 26, 'num_layers': 2, 'num_layer_0': 26, 'activation_layer_0': 'relu', 'dropout_layer_0': True, 'num_layer_1': 28, 'activation_layer_1': 'selu', 'dropout_layer_1': False, 'optimizer': 'rmsprop'}. Best is trial 1 with value: 0.5201288461685181.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 10s 1s/step - accuracy: 0.4687 - loss: 0.8300 - val_auc: 0.4991 - val_accuracy: 0.7500 - val_loss: 0.5893 - val_val_auc: 0.7133

[I 2024-04-01 03:44:06,376] Trial 2 finished with value: 0.46799516677856445 and parameters: {'unit_input': 26, 'num_layers': 2, 'num_layer_0': 22, 'activation_layer_0': 'selu', 'dropout_layer_0': True, 'num_layer_1': 21, 'activation_layer_1': 'relu', 'dropout_layer_1': True, 'optimizer': 'adam'}. Best is trial 1 with value: 0.5201288461685181.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 4s 580ms/step - accuracy: 0.4979 - loss: 0.6878 - val_auc: 0.5126 - val_accuracy: 0.4000 - val_loss: 0.7848 - val_val_auc: 0.5067

[I 2024-04-01 03:44:10,578] Trial 3 finished with value: 0.499194860458374 and parameters: {'unit_input': 27, 'num_layers': 3, 'num_layer_0': 24, 'activation_layer_0': 'relu', 'dropout_layer_0': False, 'num_layer_1': 29, 'activation_layer_1': 'tanh', 'dropout_layer_1': False, 'num_layer_2': 21, 'activation_layer_2': 'tanh', 'dropout_layer_2': False, 'optimizer': 'rmsprop'}. Best is trial 1 with value: 0.5201288461685181.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 8s 1s/step - accuracy: 0.5803 - loss: 0.6995 - val_auc: 0.5842 - val_accuracy: 0.2500 - val_loss: 0.7780 - val_val_auc: 0.2733

[I 2024-04-01 03:44:18,894] Trial 4 finished with value: 0.5750805139541626 and parameters: {'unit_input': 28, 'num_layers': 2, 'num_layer_0': 30, 'activation_layer_0': 'relu', 'dropout_layer_0': True, 'num_layer_1': 24, 'activation_layer_1': 'tanh', 'dropout_layer_1': True, 'optimizer': 'rmsprop'}. Best is trial 4 with value: 0.5750805139541626.

auc_key is val_auc
prune or not:-False
1/4 ━━━━━━━━━━━━━━━━━━━━ 5s 2s/step - accuracy: 0.4688 - loss: 0.6974 - val_auc: 0.5156

W0000 00:00:1711943060.948183      85 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update

4/4 ━━━━━━━━━━━━━━━━━━━━ 3s 492ms/step - accuracy: 0.4305 - loss: 0.7048 - val_auc: 0.4396 - val_accuracy: 0.2500 - val_loss: 0.8237 - val_val_auc: 0.5267

[I 2024-04-01 03:44:22,435] Trial 5 finished with value: 0.4200885593891144 and parameters: {'unit_input': 25, 'num_layers': 2, 'num_layer_0': 26, 'activation_layer_0': 'tanh', 'dropout_layer_0': False, 'num_layer_1': 30, 'activation_layer_1': 'tanh', 'dropout_layer_1': False, 'optimizer': 'rmsprop'}. Best is trial 4 with value: 0.5750805139541626.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 9s 1s/step - accuracy: 0.4243 - loss: 0.8074 - val_auc: 0.4023 - val_accuracy: 0.3000 - val_loss: 0.7245 - val_val_auc: 0.5467

[I 2024-04-01 03:44:31,105] Trial 6 finished with value: 0.4055958092212677 and parameters: {'unit_input': 26, 'num_layers': 3, 'num_layer_0': 29, 'activation_layer_0': 'relu', 'dropout_layer_0': True, 'num_layer_1': 24, 'activation_layer_1': 'relu', 'dropout_layer_1': False, 'num_layer_2': 21, 'activation_layer_2': 'selu', 'dropout_layer_2': True, 'optimizer': 'rmsprop'}. Best is trial 4 with value: 0.5750805139541626.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 865ms/step - accuracy: 0.5810 - loss: 0.6902 - val_auc: 0.5322 - val_accuracy: 0.5000 - val_loss: 0.7325 - val_val_auc: 0.4733

[I 2024-04-01 03:44:36,734] Trial 7 finished with value: 0.5688406229019165 and parameters: {'unit_input': 26, 'num_layers': 2, 'num_layer_0': 28, 'activation_layer_0': 'tanh', 'dropout_layer_0': True, 'num_layer_1': 29, 'activation_layer_1': 'tanh', 'dropout_layer_1': False, 'optimizer': 'rmsprop'}. Best is trial 4 with value: 0.5750805139541626.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 5s 580ms/step - accuracy: 0.5080 - loss: 0.6929 - val_auc: 0.5632 - val_accuracy: 0.2500 - val_loss: 0.7570 - val_val_auc: 0.4267

[I 2024-04-01 03:44:41,691] Trial 8 finished with value: 0.6052737832069397 and parameters: {'unit_input': 29, 'num_layers': 2, 'num_layer_0': 27, 'activation_layer_0': 'relu', 'dropout_layer_0': False, 'num_layer_1': 26, 'activation_layer_1': 'relu', 'dropout_layer_1': False, 'optimizer': 'adam'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 4s 579ms/step - accuracy: 0.4871 - loss: 0.7605 - val_auc: 0.5306 - val_accuracy: 0.4500 - val_loss: 0.7114 - val_val_auc: 0.5733

[I 2024-04-01 03:44:45,563] Trial 9 finished with value: 0.5758856534957886 and parameters: {'unit_input': 21, 'num_layers': 2, 'num_layer_0': 29, 'activation_layer_0': 'selu', 'dropout_layer_0': False, 'num_layer_1': 22, 'activation_layer_1': 'relu', 'dropout_layer_1': False, 'optimizer': 'rmsprop'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 9s 1s/step - accuracy: 0.4210 - loss: 0.8604 - val_auc: 0.4072 - val_accuracy: 0.6000 - val_loss: 0.6676 - val_val_auc: 0.4667

[I 2024-04-01 03:44:54,314] Trial 10 finished with value: 0.43196457624435425 and parameters: {'unit_input': 21, 'num_layers': 3, 'num_layer_0': 22, 'activation_layer_0': 'selu', 'dropout_layer_0': True, 'num_layer_1': 20, 'activation_layer_1': 'selu', 'dropout_layer_1': False, 'num_layer_2': 28, 'activation_layer_2': 'relu', 'dropout_layer_2': True, 'optimizer': 'rmsprop'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
1/4 ━━━━━━━━━━━━━━━━━━━━ 12s 4s/step - accuracy: 0.5000 - loss: 0.9697 - val_auc: 0.4141

W0000 00:00:1711943098.520198      87 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update

4/4 ━━━━━━━━━━━━━━━━━━━━ 7s 1s/step - accuracy: 0.4449 - loss: 0.9636 - val_auc: 0.4353 - val_accuracy: 0.6000 - val_loss: 0.6684 - val_val_auc: 0.6467

[I 2024-04-01 03:45:01,538] Trial 11 finished with value: 0.4450483024120331 and parameters: {'unit_input': 25, 'num_layers': 3, 'num_layer_0': 27, 'activation_layer_0': 'selu', 'dropout_layer_0': True, 'num_layer_1': 20, 'activation_layer_1': 'selu', 'dropout_layer_1': False, 'num_layer_2': 20, 'activation_layer_2': 'selu', 'dropout_layer_2': False, 'optimizer': 'adam'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 998ms/step - accuracy: 0.5360 - loss: 0.7156 - val_auc: 0.5210 - val_accuracy: 0.2500 - val_loss: 0.7807 - val_val_auc: 0.4867

[I 2024-04-01 03:45:07,925] Trial 12 finished with value: 0.5235506892204285 and parameters: {'unit_input': 21, 'num_layers': 3, 'num_layer_0': 22, 'activation_layer_0': 'selu', 'dropout_layer_0': False, 'num_layer_1': 23, 'activation_layer_1': 'selu', 'dropout_layer_1': False, 'num_layer_2': 20, 'activation_layer_2': 'relu', 'dropout_layer_2': True, 'optimizer': 'rmsprop'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 923ms/step - accuracy: 0.5082 - loss: 0.7771 - val_auc: 0.3970 - val_accuracy: 0.2500 - val_loss: 0.7339 - val_val_auc: 0.4400

[I 2024-04-01 03:45:13,938] Trial 13 finished with value: 0.37560388445854187 and parameters: {'unit_input': 30, 'num_layers': 2, 'num_layer_0': 28, 'activation_layer_0': 'tanh', 'dropout_layer_0': False, 'num_layer_1': 20, 'activation_layer_1': 'relu', 'dropout_layer_1': True, 'optimizer': 'rmsprop'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
1/4 ━━━━━━━━━━━━━━━━━━━━ 24s 8s/step - accuracy: 0.4062 - loss: 1.2144 - val_auc: 0.3138

W0000 00:00:1711943122.104848      84 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update

4/4 ━━━━━━━━━━━━━━━━━━━━ 14s 2s/step - accuracy: 0.4137 - loss: 1.1424 - val_auc: 0.3553 - val_accuracy: 0.5500 - val_loss: 0.6868 - val_val_auc: 0.6133

[I 2024-04-01 03:45:27,596] Trial 14 finished with value: 0.37620770931243896 and parameters: {'unit_input': 25, 'num_layers': 3, 'num_layer_0': 27, 'activation_layer_0': 'tanh', 'dropout_layer_0': True, 'num_layer_1': 23, 'activation_layer_1': 'tanh', 'dropout_layer_1': True, 'num_layer_2': 21, 'activation_layer_2': 'selu', 'dropout_layer_2': True, 'optimizer': 'adam'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 4s 526ms/step - accuracy: 0.4569 - loss: 0.8955 - val_auc: 0.4505 - val_accuracy: 0.7500 - val_loss: 0.5849 - val_val_auc: 0.4867

[I 2024-04-01 03:45:31,656] Trial 15 finished with value: 0.4549114406108856 and parameters: {'unit_input': 29, 'num_layers': 2, 'num_layer_0': 26, 'activation_layer_0': 'relu', 'dropout_layer_0': False, 'num_layer_1': 29, 'activation_layer_1': 'selu', 'dropout_layer_1': False, 'optimizer': 'adam'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 9s 1s/step - accuracy: 0.4303 - loss: 0.8842 - val_auc: 0.4923 - val_accuracy: 0.7500 - val_loss: 0.5777 - val_val_auc: 0.6133

[I 2024-04-01 03:45:41,125] Trial 16 finished with value: 0.450483113527298 and parameters: {'unit_input': 24, 'num_layers': 2, 'num_layer_0': 22, 'activation_layer_0': 'selu', 'dropout_layer_0': True, 'num_layer_1': 23, 'activation_layer_1': 'tanh', 'dropout_layer_1': True, 'optimizer': 'adam'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 5s 636ms/step - accuracy: 0.4477 - loss: 0.7139 - val_auc: 0.4822 - val_accuracy: 0.5000 - val_loss: 0.6979 - val_val_auc: 0.5600

[I 2024-04-01 03:45:46,477] Trial 17 finished with value: 0.4949677884578705 and parameters: {'unit_input': 29, 'num_layers': 3, 'num_layer_0': 30, 'activation_layer_0': 'relu', 'dropout_layer_0': False, 'num_layer_1': 27, 'activation_layer_1': 'tanh', 'dropout_layer_1': False, 'num_layer_2': 20, 'activation_layer_2': 'tanh', 'dropout_layer_2': False, 'optimizer': 'adam'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 5s 807ms/step - accuracy: 0.4477 - loss: 0.8181 - val_auc: 0.4282 - val_accuracy: 0.3000 - val_loss: 0.8714 - val_val_auc: 0.6000

[I 2024-04-01 03:45:51,964] Trial 18 finished with value: 0.4108293056488037 and parameters: {'unit_input': 20, 'num_layers': 2, 'num_layer_0': 28, 'activation_layer_0': 'relu', 'dropout_layer_0': False, 'num_layer_1': 20, 'activation_layer_1': 'selu', 'dropout_layer_1': True, 'optimizer': 'rmsprop'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
1/4 ━━━━━━━━━━━━━━━━━━━━ 12s 4s/step - accuracy: 0.3750 - loss: 0.8735 - val_auc: 0.4157

W0000 00:00:1711943156.175617      86 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update

4/4 ━━━━━━━━━━━━━━━━━━━━ 7s 920ms/step - accuracy: 0.4403 - loss: 0.8202 - val_auc: 0.4912 - val_accuracy: 0.6000 - val_loss: 0.6847 - val_val_auc: 0.3267

[I 2024-04-01 03:45:58,947] Trial 19 finished with value: 0.5179147124290466 and parameters: {'unit_input': 25, 'num_layers': 3, 'num_layer_0': 25, 'activation_layer_0': 'selu', 'dropout_layer_0': False, 'num_layer_1': 22, 'activation_layer_1': 'tanh', 'dropout_layer_1': False, 'num_layer_2': 23, 'activation_layer_2': 'selu', 'dropout_layer_2': True, 'optimizer': 'adam'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-False
4/4 ━━━━━━━━━━━━━━━━━━━━ 4s 550ms/step - accuracy: 0.5401 - loss: 0.6980 - val_auc: 0.4525 - val_accuracy: 0.3000 - val_loss: 0.7103 - val_val_auc: 0.4467

[I 2024-04-01 03:46:03,351] Trial 20 finished with value: 0.43679550290107727 and parameters: {'unit_input': 22, 'num_layers': 2, 'num_layer_0': 27, 'activation_layer_0': 'tanh', 'dropout_layer_0': False, 'num_layer_1': 27, 'activation_layer_1': 'relu', 'dropout_layer_1': False, 'optimizer': 'rmsprop'}. Best is trial 8 with value: 0.6052737832069397.

auc_key is val_auc
prune or not:-false

Steps to reproduce

import optuna
import numpy as np
from tensorflow.keras.layers import Dense, Dropout
import tensorflow as tf
from tensorflow.keras.models import Sequential


# Toy dataset generation
def generate_toy_dataset():
    np.random.seed(0)
    X_train = np.random.rand(100, 10)
    y_train = np.random.randint(0, 2, size=(100,))
    X_val = np.random.rand(20, 10)
    y_val = np.random.randint(0, 2, size=(20,))
    return X_train, y_train, X_val, y_val

X_train, y_train, X_val, y_val = generate_toy_dataset()

# Model building function
def build_model(trial):
    model = Sequential()
    model.add(Dense(units=trial.suggest_int('unit_input', 20, 30),
                    activation='selu',
                    input_shape=(X_train.shape[1],)))

    num_layers = trial.suggest_int('num_layers', 2, 3)
    for i in range(num_layers):
        units = trial.suggest_int(f'num_layer_{i}', 20, 30)
        activation = trial.suggest_categorical(f'activation_layer_{i}', ['relu', 'selu', 'tanh'])
        model.add(Dense(units=units, activation=activation))
        if trial.suggest_categorical(f'dropout_layer_{i}', [True, False]):
            model.add(Dropout(rate=0.5))

    model.add(Dense(1, activation='sigmoid'))

    optimizer_name = trial.suggest_categorical('optimizer', ['adam', 'rmsprop'])
    if optimizer_name == 'adam':
        optimizer = tf.keras.optimizers.Adam()
    else:
        optimizer = tf.keras.optimizers.RMSprop()

    model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy', tf.keras.metrics.AUC(name='val_auc')])

    return model

def objective(trial):
    model = build_model(trial)
    # No `epochs` argument is passed, so Keras trains for its default of one epoch.
    history = model.fit(X_train, y_train, validation_data=(X_val, y_val), verbose=1)

    # Find the first history key that starts with 'val_auc'.
    # Note: the metric itself is named 'val_auc' in build_model, so this key holds
    # the training AUC; Keras stores the validation AUC under 'val_val_auc'.
    auc_key = None
    for key in history.history.keys():
        if key.startswith('val_auc'):
            auc_key = key
            print(f"auc_key is {auc_key}")
            break

    if auc_key is None:
        raise ValueError("AUC metric not found in history. Make sure it's being recorded during training.")

    # Use step 0 for the plain key; otherwise take the numeric suffix of the key.
    if auc_key == "val_auc":
        step = 0
    else:
        step = int(auc_key.split('_')[-1])

    # Report the single recorded value to the pruner.
    auc_value = history.history[auc_key][0]
    trial.report(auc_value, step=step)
    should_prune = trial.should_prune()
    print(f"prune or not:-{should_prune}")
    if should_prune:
        raise optuna.TrialPruned()

    # Return a scalar rather than the whole history list.
    return auc_value

# Optuna study creation
study = optuna.create_study(
    direction='maximize',
    pruner=optuna.pruners.HyperbandPruner(
        min_resource=5,
        max_resource=20,
        reduction_factor=2
    )
)

# Start optimization
study.optimize(objective)
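
One thing that stands out in the script above: `model.fit` is called without an `epochs` argument, so each trial trains for Keras's default of one epoch and reports a single value at step 0. A pruner can only compare trials at steps that are actually reported, so this may be why every line in the log shows `prune or not:-False`. A minimal sketch of per-epoch reporting follows. It assumes a hypothetical helper `train_with_reports` and a hypothetical `StubTrial` stand-in so that it runs without Optuna or TensorFlow; the only Optuna calls it relies on are `trial.report(value, step)` and `trial.should_prune()`.

```python
from typing import Callable, Type


def train_with_reports(
    trial,
    train_one_epoch: Callable[[int], float],
    max_epochs: int,
    pruned_error: Type[Exception] = RuntimeError,
) -> float:
    """Train one epoch at a time, reporting each score so a pruner can act.

    `trial` only needs report(value, step) and should_prune().
    Assumption: RuntimeError is a placeholder default so the sketch runs
    without Optuna; a real objective would pass optuna.TrialPruned.
    """
    score = float("nan")
    for epoch in range(1, max_epochs + 1):
        score = train_one_epoch(epoch)    # e.g. one call to model.fit(epochs=1)
        trial.report(score, step=epoch)   # one intermediate value per epoch
        if trial.should_prune():
            raise pruned_error(f"pruned at epoch {epoch}")
    return score


class StubTrial:
    """Hypothetical stand-in for an Optuna trial, for illustration only."""

    def __init__(self, prune_at=None):
        self.reports = []
        self.prune_at = prune_at

    def report(self, value, step):
        self.reports.append((step, value))

    def should_prune(self):
        # Pretend the pruner stops the trial at one chosen step.
        return self.reports[-1][0] == self.prune_at


stub = StubTrial()
final = train_with_reports(stub, lambda epoch: epoch / 20, max_epochs=20)
print(len(stub.reports), final)  # 20 1.0
```

In a real objective, `trial` would be the Optuna trial, `train_one_epoch` could wrap `model.fit(..., epochs=1)` and return that epoch's validation AUC, and `pruned_error` would be `optuna.TrialPruned`.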

Additional context (optional)
