
Modification to run with multi-output models #154

Closed
JoshuaMitton opened this issue Dec 14, 2018 · 15 comments
Assignees
Labels
topic: data Has to do with input or output data

@JoshuaMitton

It would be great if possible to allow the option of using multi-output Keras models.

It is currently stated in Scan.py that y should have shape (m, c), where c is the number of classes. This restricts models to a single output, and passing a multi-output model yields the following error:
```
~/.local/lib/python3.6/site-packages/talos/scan/scan_prepare.py in scan_prepare(self)
     57
     58     # create the data asset
---> 59     self.y_max = self.y.max()
     60     self = validation_split(self)
     61     self.shape = classify(self.y)

AttributeError: 'list' object has no attribute 'max'
```

For example, the model I am currently working with outputs Age, Species, and Status, where each output has a different value of c in the shape (m, c).
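For reference, the failure can be reproduced without Talos at all: `.max()` exists on a NumPy array but not on a Python list of arrays. A minimal sketch (the shapes here are illustrative):

```python
import numpy as np

# single-output y, shape (m, c): what Scan.py currently expects
y_single = np.eye(3)[[0, 1, 2, 1]]
print(y_single.max())  # 1.0 - works

# multi-output y: a list of arrays with different c (e.g. age, species, status)
y_multi = [np.zeros((4, 17)), np.zeros((4, 3)), np.zeros((4, 3))]
try:
    y_multi.max()
except AttributeError as e:
    print(e)  # 'list' object has no attribute 'max'
```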

@JoshuaMitton
Author

I have updated to the latest development version of Talos and this issue still persists.

The output to my model is as follows:
```python
xAge = Dense(name='age', units=17,
             activation='softmax',
             kernel_regularizer=l2(params['regage']),
             kernel_initializer='he_normal')(xd)
xSpecies = Dense(name='species', units=3,
                 activation='softmax',
                 kernel_regularizer=l2(params['regspecies']),
                 kernel_initializer='he_normal')(xd)
xStatus = Dense(name='status', units=3,
                activation='softmax',
                kernel_regularizer=l2(params['regstatus']),
                kernel_initializer='he_normal')(xd)

outputs = []
for i in ['xAge', 'xSpecies', 'xStatus']:
    outputs.append(locals()[i])
model = Model(inputs=input_vec, outputs=outputs)
```
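As an aside, the `locals()` loop is a roundabout way of collecting the three output tensors; a plain list literal does the same thing. A sketch with string stand-ins for the Keras tensors:

```python
# stand-ins for the three Dense output tensors
xAge, xSpecies, xStatus = 'age_out', 'species_out', 'status_out'

# the loop from the snippet above (at module level, locals() sees these names)
outputs = []
for i in ['xAge', 'xSpecies', 'xStatus']:
    outputs.append(locals()[i])

# equivalent, and clearer
assert outputs == [xAge, xSpecies, xStatus]
```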

From the error message it appears that Talos looks for the max value of the y data. Given that y here is a list of three NumPy arrays, it has no max attribute and errors out.

Can Talos be updated to handle multi-input / multi-output models? Or is my understanding of how to set this up incorrect?

I am calling Talos like this (my model also runs fine when not using Talos):

```python
h = ta.Scan(x=X_train, y=y_train, params=params, model=CNN_model,
            x_val=X_val, y_val=y_val, dataset_name='mosquito_CNN',
            experiment_no=str(fold), grid_downsample=0.01)
```

@mikkokotila
Contributor

Can you share the CNN_model you are using as input for Scan()?

@JoshuaMitton
Copy link
Author

JoshuaMitton commented Dec 19, 2018

Yes, it is as follows:

```python
def CNN_model(X_train, y_train, X_val, y_val, params):

    input_vec = Input(name='input', shape=(1650, 1))

    xd = Conv1D(name='Conv_1', filters=params['filter_1'],
                kernel_size=params['kernel_1'], strides=params['stride_1'],
                activation='relu',
                kernel_regularizer=l2(params['regconv']),
                kernel_initializer='he_normal')(input_vec)
    xd = BatchNormalization(name='batchnorm_1')(xd)
    xd = MaxPooling1D(pool_size=params['pool_1'])(xd)

    xd = Conv1D(name='Conv_2', filters=params['filter_2'],
                kernel_size=params['kernel_2'], strides=params['stride_2'],
                activation='relu',
                kernel_regularizer=l2(params['regconv']),
                kernel_initializer='he_normal')(xd)
    xd = BatchNormalization(name='batchnorm_2')(xd)
    xd = MaxPooling1D(pool_size=params['pool_2'])(xd)

    xd = Conv1D(name='Conv_3', filters=params['filter_3'],
                kernel_size=params['kernel_3'], strides=params['stride_3'],
                activation='relu',
                kernel_regularizer=l2(params['regconv']),
                kernel_initializer='he_normal')(xd)
    xd = BatchNormalization(name='batchnorm_3')(xd)
    xd = MaxPooling1D(pool_size=params['pool_3'])(xd)

    xd = Conv1D(name='Conv_4', filters=params['filter_4'],
                kernel_size=params['kernel_4'], strides=params['stride_4'],
                activation='relu',
                kernel_regularizer=l2(params['regconv']),
                kernel_initializer='he_normal')(xd)
    xd = BatchNormalization(name='batchnorm_4')(xd)
    xd = MaxPooling1D(pool_size=params['pool_4'])(xd)

    xd = Conv1D(name='Conv_5', filters=params['filter_5'],
                kernel_size=params['kernel_5'], strides=params['stride_5'],
                activation='relu',
                kernel_regularizer=l2(params['regconv']),
                kernel_initializer='he_normal')(xd)
    xd = BatchNormalization(name='batchnorm_5')(xd)
    xd = MaxPooling1D(pool_size=params['pool_5'])(xd)

    xd = Flatten()(xd)

    xd = Dropout(name='dout_6', rate=params['dropout_rate'])(xd)
    xd = Dense(name='d_6', units=params['dense_width_1'], activation='relu',
               kernel_regularizer=l2(params['regdense']),
               kernel_initializer='he_normal')(xd)
    xd = BatchNormalization(name='batchnorm_6')(xd)

    xAge = Dense(name='age', units=17,
                 activation='softmax',
                 kernel_regularizer=l2(params['regage']),
                 kernel_initializer='he_normal')(xd)
    xSpecies = Dense(name='species', units=3,
                     activation='softmax',
                     kernel_regularizer=l2(params['regspecies']),
                     kernel_initializer='he_normal')(xd)
    xStatus = Dense(name='status', units=3,
                    activation='softmax',
                    kernel_regularizer=l2(params['regstatus']),
                    kernel_initializer='he_normal')(xd)

    outputs = []
    for i in ['xAge', 'xSpecies', 'xStatus']:
        outputs.append(locals()[i])
    model = Model(inputs=input_vec, outputs=outputs)

    model.compile(loss=params['loss'], metrics=['acc'],
                  optimizer=params['optimizer'](lr=lr_normalizer(params['lr'], params['optimizer']),
                                                decay=params['decay'],
                                                momentum=params['momentum'],
                                                nesterov=True))
    model.summary()

    out = model.fit(x=X_train,
                    y=y_train,
                    batch_size=64,
                    verbose=0,
                    epochs=10,
                    validation_data=(X_val, y_val),
                    callbacks=[keras.callbacks.EarlyStopping(monitor='val_loss',
                                                             patience=400, verbose=0,
                                                             mode='auto'),
                               CSVLogger(model_name + '.csv', append=True, separator=';')])

    return out, model
```

@mikkokotila mikkokotila self-assigned this Dec 20, 2018
@mikkokotila mikkokotila added the topic: data Has to do with input or output data label Dec 20, 2018
@mikkokotila
Contributor

I think this needs to be handled together with #145 and #155.

@rarazac

rarazac commented Jul 23, 2019

Any news on the support of multiple outputs?

@mikkokotila
Contributor

@rarazac the good news is that this is probably the most likely candidate for the next new feature. I've updated the status label accordingly.

@mikkokotila
Contributor

To add a little background: in my own research, as well as in our wider research group, implementing ethical considerations is a key interest, and multi-output models seem like a good place to start.

@rarazac

rarazac commented Jul 23, 2019

@mikkokotila Thank you very much for your response!

If the multi-output model uses the same labels for every output, the following minimal example seems to work:

```python
import talos as tl
import tensorflow as tf

def getModelForSweep(x_train, y_train, x_val, y_val, params):

    input_shape = (28, 28)  # 28x28 pixels
    inp = tf.keras.layers.Input(shape=input_shape, name="main_input")

    # shared network
    shared_net = tf.keras.layers.Dense(params['shared_fc'], activation="relu")(inp)
    shared_network_out = tf.keras.layers.Flatten(name="shared_out")(shared_net)

    # out_1
    out_1 = tf.keras.layers.Dense(params['fc'])(shared_network_out)
    out_1 = tf.keras.layers.Dense(10, activation="softmax", name="out_1")(out_1)

    # out_2
    out_2 = tf.keras.layers.Dense(params['fc'])(shared_network_out)
    out_2 = tf.keras.layers.Dense(10, activation="softmax", name="out_2")(out_2)

    # build, compile and train the model
    model = tf.keras.models.Model(inputs=inp, outputs=[out_1, out_2])
    model.compile(optimizer=params['optimizer'],
                  metrics=['accuracy'],
                  loss={'out_1': tf.keras.losses.categorical_crossentropy,
                        'out_2': tf.keras.losses.categorical_crossentropy})
    out = model.fit(x_train, [y_train, y_train], validation_split=0.1)

    return out, model

# parameters for sweep
p = {
    'lr': [0.01, 0.001, 0.0001],
    'shared_fc': [32, 64],
    'fc': [64, 128],
    'optimizer': ['Adam', 'sgd'],
}

# load data
(mnist_train, train_labels), (mnist_test, test_labels) = tf.keras.datasets.mnist.load_data()
train_labels_hot = tf.keras.utils.to_categorical(train_labels, num_classes=10)

# sweep
tl.Scan(x=mnist_train, y=train_labels_hot,
        params=p,
        model=getModelForSweep,
        grid_downsample=0.1,
        experiment_no='1',
        dataset_name='mnist')
```

@mikkokotila
Contributor

Thanks. Given that this runs ok, am I right that the issue comes down simply to being able to pass two separate y arrays into talos.Scan()?

@mikkokotila
Contributor

This is now implemented in daily-dev:

```
pip install git+https://github.com/autonomio/talos@daily-dev
```

Below is a working example for reference:

```python
import talos as tl
import tensorflow as tf

def getModelForSweep(x_train, y_train, x_val, y_val, params):

    input_shape = (28, 28)  # 28x28 pixels
    inp = tf.keras.layers.Input(shape=input_shape, name="main_input")

    # shared network
    shared_net = tf.keras.layers.Dense(params['shared_fc'], activation="relu")(inp)
    shared_network_out = tf.keras.layers.Flatten(name="shared_out")(shared_net)

    # out_1
    out_1 = tf.keras.layers.Dense(params['fc'])(shared_network_out)
    out_1 = tf.keras.layers.Dense(10, activation="softmax", name="out_1")(out_1)

    # out_2
    out_2 = tf.keras.layers.Dense(params['fc'])(shared_network_out)
    out_2 = tf.keras.layers.Dense(10, activation="softmax", name="out_2")(out_2)

    # build, compile and train the model
    model = tf.keras.models.Model(inputs=inp, outputs=[out_1, out_2])
    model.compile(optimizer=params['optimizer'],
                  metrics=['accuracy'],
                  loss={'out_1': tf.keras.losses.categorical_crossentropy,
                        'out_2': tf.keras.losses.categorical_crossentropy})
    out = model.fit(x=x_train,
                    y=[y_train[0], y_train[1]],
                    validation_data=[x_val, [y_val[0], y_val[1]]])

    return out, model

# parameters for sweep
p = {
    'lr': [0.01, 0.001, 0.0001],
    'shared_fc': [32, 64],
    'fc': [64, 128],
    'optimizer': ['Adam', 'sgd'],
}

# load data
(mnist_train, train_labels), (mnist_test, test_labels) = tf.keras.datasets.mnist.load_data()
train_labels_hot = tf.keras.utils.to_categorical(train_labels, num_classes=10)

# sweep
tl.Scan(x=mnist_train, y=[train_labels_hot, train_labels_hot],
        x_val=mnist_train, y_val=[train_labels_hot, train_labels_hot],
        params=p,
        model=getModelForSweep,
        fraction_limit=0.1,
        experiment_name='mnist', clear_session=False)
```

Thanks a lot to @rarazac for the inputs and the example model.

@mikkokotila
Contributor

mikkokotila commented Jul 24, 2019

ONE IMPORTANT POINT TO KEEP IN MIND

You must explicitly pass validation data (as opposed to using the split argument). This is consistent with multi-input models, where Talos also expects validation data to be passed explicitly.
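A sketch of what that explicit split can look like for a two-output setup (plain NumPy slicing; all names and shapes here are illustrative, not part of the Talos API):

```python
import numpy as np

# toy data: 100 samples, two one-hot targets of different widths
x = np.random.rand(100, 28, 28)
y1 = np.eye(10)[np.random.randint(0, 10, size=100)]  # first output, 10 classes
y2 = np.eye(3)[np.random.randint(0, 3, size=100)]    # second output, 3 classes

# manual 90/10 split instead of relying on an automatic split argument
cut = int(len(x) * 0.9)
x_train, x_val = x[:cut], x[cut:]
y_train = [y1[:cut], y2[:cut]]
y_val = [y1[cut:], y2[cut:]]

# these would then be passed as x=..., y=..., x_val=..., y_val=... to Scan()
print(x_train.shape, y_train[0].shape, y_train[1].shape)  # (90, 28, 28) (90, 10) (90, 3)
```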

@sarmadm

sarmadm commented Jul 24, 2019

If I have lists like:

```python
X_train = [f0_train_tw, f1_train_tw, f2_train_tw, f3_train_tw, f4_one_tr, f5_train_preds, f6_train]
Y_train = [Y_oh_tr, Ymrs_train, Y_oh_tr]
```

and test:

```python
X_test = [f0_test_tw, f1_test_tw, f2_test_tw, f3_test_tw, f4_test_tr, f5_test_preds, f6_test]
Y_train = [Y_oh_tr, Ymrs_train, Y_oh_tr]
```

is the below the correct way to call model.fit?

```python
out = model.fit(x=X_train, y=Y_train,
                epochs=params['epochs'],
                batch_size=params['batch_size'],
                validation_data=[X_test, Y_test],
                callbacks=[chkpt_r, chkpt_c, chkpt_r2c, reduceLR],
                class_weight=class_weight)

return out, model
```

```python
import talos as ta

ta.Scan(X_train, Y_train, p, build_mdl, x_val=X_test, y_val=Y_test)
```

When I run it, the loss is nan.

@mikkokotila
Contributor

@sarmadm No, the above is not correct. Please look at the provided example carefully and make an effort to follow it. You can also just copy-paste the example, run it on your machine, and then work from there to implement your own data and configuration.

@sarmadm

sarmadm commented Jul 24, 2019

I changed it to:

```python
out = model.fit(x=X_train, y=Y_train,
                epochs=5,
                batch_size=params['batch_size'],
                validation_data=[X_test, Y_test],
                callbacks=[chkpt_r, chkpt_c, chkpt_r2c, reduceLR],
                class_weight=class_weight)
```

It runs, but when moving from one parameter permutation to the next, this error shows:

```
TypeError: can't pickle _thread.RLock objects
```
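For context (this is not from the thread's resolution, just a general note): that TypeError is Python's pickle module refusing to serialize a lock object, which typically surfaces when something being deep-copied or pickled between rounds (for example a callback or a model object) holds a thread lock. A minimal reproduction of the underlying error:

```python
import pickle
import threading

# RLock objects are not picklable; any object graph containing one fails too
try:
    pickle.dumps(threading.RLock())
except TypeError as e:
    print(e)  # e.g. can't pickle _thread.RLock objects
```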

@mikkokotila
Contributor

mikkokotila commented Jul 25, 2019

@sarmadm as I said, what you have to do is run the provided example on your system and then work from there. If you would like user support for your specific case, you could open a new issue for that.

But first, you really have to look at the example carefully. Pay attention especially to this:

```python
out = model.fit(x=x_train,
                y=[y_train[0], y_train[1]],
                validation_data=[x_val, [y_val[0], y_val[1]]])
```

The case you are highlighting is clearly far from it and will definitely cause errors.

Closing this here as the actual matter is resolved, and the feature implemented and tested.
