combine_bidir causes batch to be mixed up #138

Open
mingruimingrui opened this issue Apr 5, 2020 · 0 comments
combine_bidir is a function in source/embed.py#L230.
It's used to concatenate the forward and backward hidden tensors from a bidirectional LSTM.

def combine_bidir(outs):
    return torch.cat([
        torch.cat([outs[2 * i], outs[2 * i + 1]], dim=0).view(1, bsz, self.output_units)
        for i in range(self.num_layers)
    ], dim=0)

Here outs is a tensor of shape [num_layers * num_dir, bsz, hidden_size] (the hidden state of a bidirectional LSTM, laid out layer-major).
The goal is to rearrange it into the form [num_layers, bsz, num_dir * hidden_size].
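For context, this layout matches the hidden state returned by PyTorch's nn.LSTM when bidirectional=True. A minimal sketch with made-up sizes (not the project's code):

import torch
import torch.nn as nn

num_layers, bsz, hidden_size, seq_len, input_size = 2, 3, 4, 5, 6
lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, bidirectional=True)
x = torch.randn(seq_len, bsz, input_size)
_, (h_n, _) = lstm(x)

# h_n is laid out layer-major: index 2*i is the forward direction of layer i,
# index 2*i + 1 the backward direction.
print(h_n.shape)  # torch.Size([4, 3, 4]) == [num_layers * num_dir, bsz, hidden_size]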

The error is that the inner torch.cat should join the forward and backward tensors along the final dimension, not the first. As written, the current implementation mixes up tensor values between different entries in the same batch. A quick fix is to change dim=0 to dim=-1 in the inner concatenation (a toy demonstration follows the fixed function below).

def combine_bidir(outs):
    return torch.cat([
        torch.cat([outs[2 * i], outs[2 * i + 1]], dim=-1).view(1, bsz, self.output_units)
        for i in range(self.num_layers)
    ], dim=0)
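
A toy example (made-up sizes, outside the class) showing why dim=0 mixes batch entries while dim=-1 does not:

import torch

bsz, hidden_size = 4, 3
fwd = torch.arange(bsz * hidden_size).float().reshape(bsz, hidden_size)        # forward hidden state
bwd = 100 + torch.arange(bsz * hidden_size).float().reshape(bsz, hidden_size)  # backward hidden state

wrong = torch.cat([fwd, bwd], dim=0).view(1, bsz, 2 * hidden_size)
right = torch.cat([fwd, bwd], dim=-1).view(1, bsz, 2 * hidden_size)

print(wrong[0, 0])  # rows fwd[0] and fwd[1] -- two different batch entries mixed together
print(right[0, 0])  # fwd[0] followed by bwd[0] -- the intended result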

However, the code is still rather convoluted and includes an unnecessary for loop, which hampers readability. I suggest using pure reshape and transpose operations for this task.

def combine_bidir(outs):
    # [num_layers * num_dir, bsz, hidden_size]
    #   -> [num_layers, num_dir, bsz, hidden_size]
    #   -> [num_layers, bsz, num_dir, hidden_size]
    #   -> [num_layers, bsz, num_dir * hidden_size]
    outs = outs.reshape(self.num_layers, 2, bsz, self.hidden_size)
    outs = outs.transpose(1, 2)
    return outs.reshape(self.num_layers, bsz, self.output_units)
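
As a sanity check (a standalone sketch with dummy values, so the self.* attributes are replaced by plain variables), the reshape/transpose variant produces exactly the same result as the corrected concatenation version:

import torch

num_layers, bsz, hidden_size = 2, 3, 4
output_units = 2 * hidden_size
outs = torch.randn(num_layers * 2, bsz, hidden_size)

via_cat = torch.cat([
    torch.cat([outs[2 * i], outs[2 * i + 1]], dim=-1).view(1, bsz, output_units)
    for i in range(num_layers)
], dim=0)

via_reshape = outs.reshape(num_layers, 2, bsz, hidden_size).transpose(1, 2).reshape(num_layers, bsz, output_units)

print(torch.equal(via_cat, via_reshape))  # True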