
It seems that the 'moe' type of fc layer has not been implemented #1893

Open
Roshanson opened this issue Jul 30, 2021 · 2 comments

@Roshanson

Description

It seems that the 'moe' type of fc (feed-forward) layer has not been implemented.

In tensor2tensor/layers/common_attention.py:289, I can't find an fc layer of the moe type:

  cur_layers = dict(
      # Attention layers:
      a=multihead_attention_fn,  # Multihead full attention
      loc=local_attention_fn,  # Local attention
      locm=local_attention_masked_fn,  # Local attention (masked)
      red=compressed_attention_fn,  # Memory-compressed attention
      redm=compressed_attention_masked_fn,  # Memory-compressed att (masked)
      mem=memeff_attention_fn,  # Memory efficient
      # Feed-forward layers:
      fc=conv_hidden_relu,  # Fully connected
      sep=sep_conv_relu,  # Separable convolution (unmasked)
      sepm=sep_conv_relu_masked,  # Separable convolution (masked)
  )
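
As the traceback below shows, running with the transformer_moe_2k hparams ends up requesting a feed-forward layer of type "moe", but the dictionary above only registers fc, sep, and sepm, so the lookup fails. A minimal, self-contained sketch of that failure (the lambdas are stand-ins, not the real tensor2tensor layer functions; only the dictionary keys matter):

  # Stand-in layer functions; only the keys of the dict are relevant here.
  cur_layers = dict(
      a=lambda x: x,     # stand-in for multihead_attention_fn
      fc=lambda x: x,    # stand-in for conv_hidden_relu
      sep=lambda x: x,   # stand-in for sep_conv_relu
      sepm=lambda x: x,  # stand-in for sep_conv_relu_masked
  )

  ff_type = "moe"                 # the feed-forward type the hparams ask for
  layer_fn = cur_layers[ff_type]  # raises KeyError: 'moe', matching the log below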

Environment information

OS: Linux
tensor2tensor: 1.14.1
Python: 3.6.5

Error logs

KeyError: in converted code:
    relative to tensor2tensor:

    utils\t2t_model.py:326 call
        sharded_logits, losses = self.model_fn_sharded(sharded_features)
    utils\t2t_model.py:374 model_fn_sharded
        self._to_single_features_dict(transformed_features))
    models\research\transformer_moe.py:172 body_sharded
        x = prepostprocess(layers[ff_type])(

    KeyError: 'moe'

Steps to reproduce:

set:
 FLAGS.model = "transformer_moe"  
 FLAGS.hparams_set = "transformer_moe_2k" 

then start training.
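
Until a "moe" feed-forward layer is actually registered in that dictionary, one possible stopgap is to point the feed-forward type back at an implemented layer before training. The sketch below is only an illustration and is not verified against 1.14.1: it assumes trainer_lib.create_hparams accepts an hparams_set name, and that transformer_moe controls the feed-forward type through hparams named default_ff and/or layer_types (the hasattr checks keep the sketch harmless if those assumed names are wrong).

  # Hypothetical workaround sketch -- hparam names are assumptions, not verified.
  from tensor2tensor.utils import trainer_lib

  hparams = trainer_lib.create_hparams("transformer_moe_2k")
  if hasattr(hparams, "default_ff") and hparams.default_ff == "moe":
      hparams.default_ff = "fc"  # fall back to the implemented fully connected layer
  if hasattr(hparams, "layer_types"):
      hparams.layer_types = hparams.layer_types.replace("moe", "fc")
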
@ferdiko

ferdiko commented Oct 7, 2021

Are there any updates on this?

@ferdiko

ferdiko commented Oct 8, 2021

@Roshanson what did you end up doing?
