Value error when loading saved DIEN model #511

Open

jefflao opened this issue Feb 20, 2023 · 0 comments

Describe the bug

I am training the DIEN model on a dataset with around 20 categorical features and 5 user behavior columns, all being strings. I was able to save the model with keras.save_model in .h5 format, but it throws the following error when I try to load the model with keras.load_model:

  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/utils/generic_utils.py", line 668, in deserialize_keras_object
    deserialized_obj = cls.from_config(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/functional.py", line 670, in from_config
    input_tensors, output_tensors, created_layers = reconstruct_from_config(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/functional.py", line 1298, in reconstruct_from_config
    process_node(layer, node_data)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/functional.py", line 1244, in process_node
    output_tensors = layer(input_tensors, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer_v1.py", line 764, in __call__
    self._maybe_build(inputs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer_v1.py", line 2086, in _maybe_build
    self.build(input_shapes)
  File "/usr/local/lib/python3.8/dist-packages/deepctr/layers/sequence.py", line 255, in build
    raise ValueError('A `AttentionSequencePoolingLayer` layer requires '
ValueError: A `AttentionSequencePoolingLayer` layer requires inputs of a 3 tensor with shape (None,1,embedding_size),(None,T,embedding_size) and (None,1) Got different shapes: [TensorShape([None, 15, 35]), TensorShape([None, 1, 35]), TensorShape([None, 1])]

This seems to be an issue in how the model graph is reconstructed when calling load_model(). More details are in the Additional context section below.

To Reproduce
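
The feature columns are built roughly as follows (names, vocabulary sizes, and embedding dims below are illustrative placeholders rather than the exact values):

    from deepctr.feature_column import SparseFeat, VarLenSparseFeat

    # Placeholder sizes/dims; only the structure matters for reproducing this.
    feature_columns = [
        SparseFeat('genre', vocabulary_size=1000, embedding_dim=8,
                   use_hash=True, dtype='string'),
        # ... ~20 categorical features in total ...
    ]
    feature_columns += [
        VarLenSparseFeat(
            SparseFeat('hist_genre', vocabulary_size=1000, embedding_dim=8,
                       use_hash=True, dtype='string', embedding_name='genre'),
            maxlen=15, length_name='seq_length'),
        # ... 5 behavior sequence columns in total ...
    ]
    behavior_feat_list = ['genre']  # plus the other behavior features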

Model:

    model = DIEN(
        feature_columns,
        behavior_feat_list,
        dnn_hidden_units=[256, 128, 64],
        dnn_dropout=0.5,
        gru_type='AUGRU',
        use_negsampling=False,
        att_activation='sigmoid',
    )
    model.compile(Adam(learning_rate=1e-5), 'binary_crossentropy', metrics=['binary_crossentropy'])

Train and save model:

    history = model.fit(train_inputs,
                        train_labels,  # array of binary 'click' labels (placeholder name)
                        verbose=True,
                        epochs=1,
                        batch_size=32,
                        validation_split=0.1)  # placeholder fraction
    save_model(
        model,
        'dien.h5',
        save_format='h5',
    )

Load model (the part that raises the exception):

    from deepctr.layers import custom_objects
    loaded_model = load_model('dien.h5', custom_objects)
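
One way to sidestep the config-based reconstruction (not a fix for the underlying problem, only a sketch) would be to rebuild the model in code with the same arguments and load just the weights from the same .h5 file:

    # Untested sketch: rebuild the architecture in code and restore only the
    # weights; load_weights reads the 'model_weights' group from a full
    # save_model .h5 file, so from_config reconstruction is never invoked.
    rebuilt_model = DIEN(
        feature_columns,
        behavior_feat_list,
        dnn_hidden_units=[256, 128, 64],
        dnn_dropout=0.5,
        gru_type='AUGRU',
        use_negsampling=False,
        att_activation='sigmoid',
    )
    rebuilt_model.load_weights('dien.h5')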

Operating environment:

  • python version: 3.8
  • tensorflow version: 2.2-2.5 (TF >= 2.6 runs into numpy/TF compatibility issues; see Additional context)
  • deepctr version: 0.9.3
  • CUDA version: 11.7
  • NVIDIA driver version: 515.65.01
  • base docker image: tensorflow/tensorflow:2.5.1-gpu

Additional context

I could not try tensorflow older than 2.2 due to driver compatibility issues. DeepCTR also doesn't work with 2.6 <= TF <= 2.11.

My model has the following structure (from `model.summary()`):

genre (InputLayer)              [(None, 1)]          0                                            
__________________________________________________________________________________________________
hist_genre (InputLayer)         [(None, 15)]         0                                            
__________________________________________________________________________________________________
...
hash_28 (Hash)                  (None, 1)            0           genre[0][0]                      
__________________________________________________________________________________________________
hash_15 (Hash)                  (None, 1)            0           genre[0][0]                      
__________________________________________________________________________________________________
hash_3 (Hash)                   (None, 15)           0           hist_genre[0][0]                 
__________________________________________________________________________________________________
...
sparse_seq_emb_hist_genre (Embe multiple             404         hash_3[0][0]                     
                                                                 hash_15[0][0]                    
                                                                 hash_28[0][0]                    
__________________________________________________________________________________________________
concat (Concat)                 (None, 15, 35)       0           sparse_seq_emb_hist_category[0][0
                                                                 sparse_seq_emb_hist_channel[0][0]
                                                                 sparse_seq_emb_hist_episode[0][0]
                                                                 sparse_seq_emb_hist_genre[0][0]  
                                                                 sparse_seq_emb_hist_part[0][0]   
                                                                 sparse_seq_emb_hist_feature0[0][
__________________________________________________________________________________________________
seq_length (InputLayer)         [(None, 1)]          0                                            
__________________________________________________________________________________________________
gru1 (DynamicGRU)               (None, 15, 35)       7455        concat[0][0]                     
                                                                 seq_length[0][0]                 
__________________________________________________________________________________________________
concat_2 (Concat)               (None, 1, 35)        0           sparse_seq_emb_hist_category[2][0
                                                                 sparse_seq_emb_hist_channel[2][0]
                                                                 sparse_seq_emb_hist_episode[2][0]
                                                                 sparse_seq_emb_hist_genre[2][0]  
                                                                 sparse_seq_emb_hist_part[2][0]   
                                                                 sparse_seq_emb_hist_feature0[2][
__________________________________________________________________________________________________
attention_sequence_pooling_laye (None, 1, 15)        10081       concat_2[0][0]                   
                                                                 gru1[0][0]                       
                                                                 seq_length[0][0]              
...

Following the stack trace and a lot of extra debug messages, I believe load_model does not call the shared embedding layers on their inputs in the same order as the original model when it reconstructs the graph. Specifically, in tensorflow/python/keras/engine/functional.py, reconstruct_from_config(config, custom_objects, created_layers) builds each layer node as soon as all of its inputs are ready. As a result, a shared embedding layer in the reconstructed model such as sparse_seq_emb_hist_genre can end up emitting the embedded historical behavior sequence (from the (None, 15) input) before the embedded sparse feature (from the (None, 1) input), i.e. output[0] is the embedded behavior sequence instead of output[1]. A minimal sketch of why this matters is below.
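
As a minimal illustration of the node-order dependence (plain tf.keras, hypothetical feature names): the order in which a shared embedding layer is called fixes its node indices, and those indices are what the reconstructed graph is wired against.

    import tensorflow as tf

    query_in = tf.keras.Input(shape=(1,), name='genre')       # sparse feature
    hist_in = tf.keras.Input(shape=(15,), name='hist_genre')  # behavior sequence
    shared_emb = tf.keras.layers.Embedding(100, 8, name='sparse_seq_emb_hist_genre')

    # The call order fixes the node indices: node 0 -> query, node 1 -> history.
    query_emb = shared_emb(query_in)  # (None, 1, 8)
    hist_emb = shared_emb(hist_in)    # (None, 15, 8)

    print(shared_emb.get_output_at(0).shape)  # (None, 1, 8)
    print(shared_emb.get_output_at(1).shape)  # (None, 15, 8)
    # If reconstruct_from_config replays these calls in the other order,
    # node 0 becomes the (None, 15, 8) history output, and layers wired
    # against node indices (such as the attention layer's query/keys)
    # receive their inputs swapped, which trips the shape check above.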

Multiple Hash layers for the same input are also created when the model initializes the key and query embeddings for the attention layer, because there is no sharing mechanism for them. This likely isn't a real problem, since the two hashes should be identical.

I was able to work around this by changing the order of the embedding lookups in the DIEN model (deepctr/models/sequence/dien.py):

    keys_emb_list = embedding_lookup(embedding_dict, features, history_feature_columns,
                                     return_feat_list=history_fc_names, to_list=True)
    dnn_input_emb_list = embedding_lookup(embedding_dict, features, sparse_feature_columns,
                                          mask_feat_list=history_feature_list, to_list=True)
    # Move the query embedding lookup from first to last in the initialization order.
    query_emb_list = embedding_lookup(embedding_dict, features, sparse_feature_columns,
                                      return_feat_list=history_feature_list, to_list=True)

This modification is definitely not safe. Please let me know if anyone has a better solution. Thank you in advance.
